Hello,

On 06/18/2014 05:34 AM, Sébastien Bourdeauducq wrote:
> The work over the past month consisted mostly in refining the language
> specifications, and starting to build a rough prototype of the system in
> order to get a better overview of areas in need of attention.

Thanks. During the review the consensus appeared to be that we are happy
with the design and the way it is proceeding.

> DEVICE AND EXPERIMENT MODEL
> Experiments derive from the class artiq.language.experiment.Experiment.
> They have channels as attributes, that kernel functions can access.
> 
> A special channel is the core device, on which kernels are run.
> 
> To specify channels and parameters with a shorter syntax, experiments
> may define a channels class attribute which represents the list of
> parameters that the constructor will accept and set as attributes.

In later stages most of that will be plugged into the parameter and result
messaging structure. The only critique one may have with statically
defined attributes is the duplicate typing. It may be more convenient to
explicitly retrieve attributes/parameters in the kernels by name
(something like 'f = get_parameter("spectroscopy_freq")').
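
To make the by-name idea a bit more concrete, here is a rough sketch. All
names in it (ParameterStore, get_parameter, Spectroscopy) are hypothetical
and not part of any existing ARTIQ API; the store just stands in for
whatever the parameter messaging layer ends up being.

```python
# Hypothetical sketch: none of these names are existing ARTIQ API.

class ParameterStore:
    """Maps parameter names to values; stands in for the messaging layer."""

    _MISSING = object()

    def __init__(self, values):
        self._values = dict(values)

    def get_parameter(self, name, default=_MISSING):
        # Fail loudly on unknown names unless a default is supplied.
        if name in self._values:
            return self._values[name]
        if default is not ParameterStore._MISSING:
            return default
        raise KeyError("unknown parameter: " + name)


class Spectroscopy:
    """Experiment that declares no parameter attributes up front."""

    def __init__(self, store):
        self.store = store

    def run(self):
        # The parameter is looked up by name at the point of use, so it
        # is typed only once.
        f = self.store.get_parameter("spectroscopy_freq")
        return f


store = ParameterStore({"spectroscopy_freq": 432e6})
exp = Spectroscopy(store)
```

The cost is that a misspelled name only shows up when the lookup runs,
whereas a statically declared attribute list can be checked earlier.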

Also, possible differentiation between calibration values, constants
like latencies and other experiment parameters is something that we need
to figure out.

> KERNEL FUNCTIONS
> Kernel functions are run on the core device, that typically compiles
> them and runs them onto the hardware. The simple simulator is a special
> core device that does no compilation.

That is very useful. The other debugging tool that you have is the
Unparser that can be used at intermediate transformation stages.

> CALLING FUNCTIONS FROM KERNELS
> When calling a kernel from a kernel, the compiler attempts to inline the
> callee. This makes delays explicit and allows for statement interleaving
> and removal of parallel blocks by the compiler.

For many kernels it will be time and space efficient to lower further
and extract a large constant [(time, channel, action)...] data array
from the list of the "queue_rtio_out(time, channel, action)" calls and
then iterate over that. But let's keep that for later.
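
Roughly what I have in mind, as a plain Python sketch. The recording
queue_rtio_out below is a stand-in I made up for illustration, as are
record_events, replay and example_kernel; the real lowering would happen
in the compiler on the already inlined and interleaved kernel.

```python
# Hypothetical sketch; queue_rtio_out here is a stand-in, not the real call.

def record_events(kernel):
    """Run `kernel` with a recording queue_rtio_out and return the event table."""
    events = []

    def queue_rtio_out(time, channel, action):
        events.append((time, channel, action))

    kernel(queue_rtio_out)
    # A tuple, to stress that the table is constant data.
    return tuple(events)


def replay(events, queue_rtio_out):
    """The lowered kernel: a single loop over the constant table."""
    for time, channel, action in events:
        queue_rtio_out(time, channel, action)


def example_kernel(queue_rtio_out):
    # Already interleaved: calls are issued in time order.
    queue_rtio_out(0, 0, "on")
    queue_rtio_out(5, 1, "on")
    queue_rtio_out(10, 0, "off")
    queue_rtio_out(15, 1, "off")


table = record_events(example_kernel)
```

Replaying the table through the same function reproduces the original
call sequence, which is the property the lowering has to preserve.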

> SOFTCORE CPU EVALUATION
> LLVM EVALUATION

Mor1kx and companions appear to be a very promising candidate. The
currently better ecosystem outweighs the size penalty, IMHO.

We discussed a potential need for an FPU in the SoC. Possible use cases
would be Bayesian analysis of photon count histograms and PID control
loops that we may want to run quickly and without the need to wrap our
head around fixed point implementations. Is an estimated slowdown by a
factor of a few hundred for soft-fp relative to hard-fp reasonable?
If soft-fp turns out to be too slow, it looks like the floating point
instruction set of OpenRISC, custom FPU acceleration gateware, one of
the closed source soft cores with an FPU, or a hard CPU (CPU-FPGA chip
combination or separate) would be candidates. Could you give us your
opinion on those options?


Then, let's start brainstorming hardware options. My list is currently:

* SPEC (open hardware, promising community and availability life cycle,
many IO, rich mezzanine ecosystem, useful cross-device timing ecosystem,
cheap)
* Something based on µTCA stuff, either available FPGA cards or custom.


A few mostly unsorted items came up. Most of them are relevant at a
later stage but I'll write them down anyway:

* Need for an underflow flag to taint experiments. It would be set when
the CPU does not keep up with the timing, i.e. when computation blocks
event generation.

* In addition to the underflow flag, for longer computations or RPCs with
unpredictable timing something like this would be needed:

flag = False
with parallel:
        with sequential:
                delay(timeout)
                flag = True
        with sequential:
                for partial_result in do_computation():
                        if flag:
                                break

which might be easier to implement as:

with parallel:
        with sequential:
                delay(timeout)
                loopback_out()
        with sequential:
                for partial_result in do_computation():
                        if loopback_in():
                                break

where loopback_{out,in}() are backed by a hard-timed loopback device.

* "a.pulse(f, t1); a.pulse(f, t2)" should be lowered to a single
"a.pulse(f, t1 + t2)" at some transformation stage.

* The DDS device needs kernels like "a.phase(pi/2)" to change phase or
frequency mid-pulse. The jump would be sandwiched between two pulses
which would be joined (as above), or expressed as a parallel block:

with parallel:
        a.pulse(..., t1+t2)
        with sequential:
                delay(t1)
                a.phase(...)
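
For the pulse joining item, a small sketch of what such a pass could look
like. The tuple-based statement form and the name merge_pulses are
assumptions for illustration only; a real pass would work on the kernel
AST, and it assumes that two back-to-back pulses at the same frequency are
equivalent to one longer pulse (i.e. nothing resets the phase in between).

```python
# Hypothetical sketch; the tuple-based statement form is an assumption.

def merge_pulses(stmts):
    """Join back-to-back pulses on the same channel and frequency."""
    out = []
    for stmt in stmts:
        if (out and stmt[0] == "pulse" and out[-1][0] == "pulse"
                and out[-1][1:3] == stmt[1:3]):
            # Same channel and frequency as the previous pulse: extend it.
            _, channel, freq, duration = out.pop()
            stmt = ("pulse", channel, freq, duration + stmt[3])
        out.append(stmt)
    return out


program = [
    ("pulse", "a", 100e6, 2),
    ("pulse", "a", 100e6, 3),
    ("delay", 1),
    ("pulse", "a", 100e6, 4),
]
merged = merge_pulses(program)
```

Here the first two pulses collapse into one of duration 5, while the pulse
after the delay is left alone because it is not adjacent.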


Robert.
_______________________________________________
ARTIQ mailing list
https://ssl.serverraum.org/lists/listinfo/artiq
Migen/MiSoC: please use de...@lists.m-labs.hk instead.
