To request a 30-day free trial account, fill in all the fields of the Signup form and submit it. Once your request is approved, we will notify you by email with instructions on how to access your free account.

Please note that free accounts are available exclusively to registered companies; proof of registration must be provided if requested.

Machine Vision Full Turnkey Solutions


If you have neither the time nor the resources to develop your own machine vision design, we have the perfect solution for you.



Our full turnkey embedded machine vision solutions are ready for seamless integration into your system, so you can start using them immediately. These modules have been rigorously tested on an AMD Virtex UltraScale+ FPGA running at 800 MHz, supporting frame rates from 1 FPS to 10,000 FPS.



Powered by the state-of-the-art YOLO11 model, our modules provide industry-leading object detection and tracking performance for embedded systems.



Please fill out the Signup form and select the TurnKey option to inquire about our turnkey solutions. Our team will contact you within 24 hours.



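Comparison of MAC Unit Usage for Different YOLO11 Model Sizes on an AMD Virtex UltraScale+ FPGA at 800 MHz

| Model   | Number of Parameters | Number of MAC Operations | Frames Per Second (FPS) | Number of MAC Units Used |
|---------|----------------------|--------------------------|-------------------------|--------------------------|
| YOLO11n | 3 M                  | 3 G                      | 100                     | 915                      |
| YOLO11n | 3 M                  | 3 G                      | 1000                    | 6874                     |
| YOLO11n | 3 M                  | 3 G                      | 10,000                  | 62,838                   |
| YOLO11s | 10 M                 | 11 G                     | 100                     | 2781                     |
| YOLO11s | 10 M                 | 11 G                     | 1000                    | 22,774                   |
| YOLO11s | 10 M                 | 11 G                     | 10,000                  | 210,356                  |
| YOLO11m | 21 M                 | 34 G                     | Contact us              | Contact us               |
| YOLO11l | 27 M                 | 43 G                     | Contact us              | Contact us               |
| YOLO11x | 59 M                 | 97 G                     | Contact us              | Contact us               |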
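
As a rough sanity check on the table above, the following Python sketch estimates a lower bound on the number of MAC units for each row. It assumes that every MAC unit completes one multiply-accumulate per clock cycle and that the rounded "3 G" and "11 G" figures are MAC operations per frame; neither assumption is stated on this page, and the function name `min_mac_units` is purely illustrative.

```python
CLOCK_HZ = 800_000_000  # 800 MHz, as stated in the table caption

# (model, MACs per frame, FPS, MAC units reported in the table)
ROWS = [
    ("YOLO11n", 3_000_000_000, 100, 915),
    ("YOLO11n", 3_000_000_000, 1_000, 6_874),
    ("YOLO11n", 3_000_000_000, 10_000, 62_838),
    ("YOLO11s", 11_000_000_000, 100, 2_781),
    ("YOLO11s", 11_000_000_000, 1_000, 22_774),
    ("YOLO11s", 11_000_000_000, 10_000, 210_356),
]


def min_mac_units(macs_per_frame: int, fps: int, clock_hz: int = CLOCK_HZ) -> int:
    """Lower bound on MAC units, assuming one MAC per unit per clock cycle."""
    required_macs_per_second = macs_per_frame * fps
    # Integer ceiling division: a partial unit still costs a whole unit.
    return -(-required_macs_per_second // clock_hz)


for model, macs, fps, reported in ROWS:
    bound = min_mac_units(macs, fps)
    # Ratio of the ideal bound to the reported count (1.0 would be ideal).
    ratio = bound / reported
    print(f"{model} @ {fps:>6} FPS: bound {bound:>7}, reported {reported:>7}, ratio {ratio:.2f}")
```

Under these assumptions every reported count sits above the ideal bound, which is what one would expect once pipeline stalls, partially filled units, and non-MAC operations are accounted for. The input figures are rounded, so the ratios are only indicative.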

How It Works

  1. Log in to siwave.io

    The compiler is hosted on the networks of tier-1 cloud service providers and uses the secure HTTPS protocol together with strong authorization and authentication protocols for maximum security.

    Sign up to get a no-obligation 30-day free trial account.


  2. Load Keras model

    All built-in float and quantized TFLite operations are supported. You get instant feedback on the validity of the Keras model as soon as it is loaded.


  3. Set system parameters and select target platform

    The user interface is intuitive and exposes only a minimal number of options. No FPGA/ASIC experience or user manual is needed to use the compiler: load the Keras model, set the system parameters (Frequency and Latency), and select the target platform to generate an RTL inference in just a couple of minutes. The inference can also be optimized for either frequency or area.

    The compiler generates platform-agnostic RTL that can target any FPGA vendor/device or ASIC implementation.

    System Parameters

  4. Download IP and integrate into top-level system

    Deliverables include the encrypted SystemVerilog inference model, a SystemVerilog testbench, and a datasheet. A Quartus Platform Designer ready IP component and a Vivado IP core are also provided to facilitate integration of the inference into a top-level system. We also offer an optional system design and integration service for generating a complete FPGA or ASIC system.

    The bus interface uses a simple ready/valid handshake protocol that can handle burst sizes up to 4G bytes.

    Quartus Platform Designer IP
    Vivado IP core
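
To illustrate the ready/valid handshake mentioned in step 4, here is a minimal cycle-by-cycle model in Python. It follows the generic convention that a data word is transferred only on a clock cycle where the source asserts valid and the sink asserts ready. The signal patterns, the function name `simulate_handshake`, and the `max_cycles` guard are illustrative assumptions; the exact signal names and timing rules of the Siwave bus interface are not described on this page.

```python
from itertools import cycle as repeat_forever


def simulate_handshake(data, valid_pattern, ready_pattern, max_cycles=10_000):
    """Model a ready/valid link; return (received words, cycles elapsed).

    valid_pattern and ready_pattern are repeating per-cycle booleans for
    the source's valid signal and the sink's ready signal (hypothetical inputs).
    """
    received = []
    index = 0          # next word the source wants to send
    elapsed = 0
    valid_stream = repeat_forever(valid_pattern)
    ready_stream = repeat_forever(ready_pattern)

    while index < len(data):
        if elapsed >= max_cycles:
            raise RuntimeError("handshake never completed (valid and ready never overlap)")
        valid = next(valid_stream)
        ready = next(ready_stream)
        # A transfer happens only when both sides agree in the same cycle.
        if valid and ready:
            received.append(data[index])
            index += 1
        elapsed += 1

    return received, elapsed


# Sink applies back-pressure every other cycle; source always has data.
words, cycles_taken = simulate_handshake([1, 2, 3, 4], [True], [True, False])
print(words, cycles_taken)
```

With the sink ready on every other cycle, the four words arrive intact but take seven cycles instead of four, which shows how back-pressure stretches a burst without losing data.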

What sets us apart from the competition


3X Smaller Inferences


The entire inference is implemented in platform-agnostic RTL, so there is no runtime environment and no microcode to execute on a microprocessor. This translates to a considerable reduction in inference size, as well as much faster deployment and less maintenance effort.


2X Higher Performance


Because the entire inference is implemented in RTL, there is no time-consuming context switching between the ML hardware and a host processor to execute unsupported operations. Siwave's compiler optimizes each operation based on the system frequency and latency instead of slicing tensors to fit operations into a fixed matrix. This translates to a considerable speedup in inference execution time compared to our competitors at the same frequency or, alternatively, the potential to run the inference at a higher frequency and therefore with lower latency, as demonstrated in the table below.



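Comparing AMD/Xilinx DPUCZDX8G B512 v3.0 (1 core/1 ALU/No Softmax) with Siwave RTL convolution module on a Zynq UltraScale+ MPSoC device (xczu9eg-ffvb1156-2-i)

| Module                        | LUT | LUTRAM | FF | BRAM | DSP | Max CLK | Max Ops/cycle | GOPS  |
|-------------------------------|-----|--------|----|------|-----|---------|---------------|-------|
| AMD DPUCZDX8G (B512)          | 8%  | 2%     | 6% | 8%   | 4%  | 325 MHz | 512           | 166.4 |
| SIWAVE CONVOLUTION RTL MODULE | 3%  | 0%     | 1% | 8%   | 1%  | 500 MHz | 512           | 256   |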
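
The GOPS column in the table above is consistent with peak throughput computed as maximum operations per cycle multiplied by maximum clock frequency. The short Python sketch below reproduces that arithmetic; the function name `peak_gops` is an illustrative choice, and the result is a theoretical peak rather than a measured figure.

```python
def peak_gops(ops_per_cycle: int, clock_mhz: int) -> float:
    """Peak throughput in GOPS: ops/cycle x MHz gives MOPS; divide by 1000."""
    return ops_per_cycle * clock_mhz / 1000


dpu_gops = peak_gops(512, 325)     # AMD DPUCZDX8G (B512) row
siwave_gops = peak_gops(512, 500)  # Siwave convolution RTL module row

print(f"DPU:    {dpu_gops} GOPS")
print(f"Siwave: {siwave_gops} GOPS")
# Both modules have the same ops/cycle, so the ratio equals the clock ratio.
print(f"Peak throughput ratio: {siwave_gops / dpu_gops:.2f}x")
```

Since both rows list 512 operations per cycle, the difference in peak GOPS in this particular comparison comes entirely from the higher maximum clock frequency.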