Hello,

On 06/18/2014 05:34 AM, Sébastien Bourdeauducq wrote:
> The work over the past month consisted mostly in refining the language
> specifications, and starting to build a rough prototype of the system in
> order to get a better overview of areas in need of attention.
Thanks. During the review the consensus appeared to be that we are happy
with the design and the way it is proceeding.

> DEVICE AND EXPERIMENT MODEL
> Experiments derive from the class artiq.language.experiment.Experiment.
> They have channels as attributes, that kernel functions can access.
>
> A special channel is the core device, on which kernels are run.
>
> To specify channels and parameters with a shorter syntax, experiments
> may define a channels class attribute which represents the list of
> parameters that the constructor will accept and set as attributes.

In later stages most of that will be plugged into the parameter and
result messaging structure. The only critique one may have with
statically defined attributes is the duplicate typing. It may be more
convenient to explicitly retrieve attributes/parameters in the kernels
by name (something like 'f = get_parameter("spectroscopy_freq")').
Also, a possible differentiation between calibration values, constants
like latencies, and other experiment parameters is something that we
need to figure out.

> KERNEL FUNCTIONS
> Kernel functions are run on the core device, that typically compiles
> them and runs them onto the hardware. The simple simulator is a special
> core device that does no compilation.

That is very useful. The other debugging tool that you have is the
Unparser that can be used at intermediate transformation stages.

> CALLING FUNCTIONS FROM KERNELS
> When calling a kernel from a kernel, the compiler attempts to inline the
> callee. This makes delays explicit and allows for statement interleaving
> and removal of parallel blocks by the compiler.

For many kernels it will be time and space efficient to lower further
and extract a large constant [(time, channel, action), ...] data array
from the list of "queue_rtio_out(time, channel, action)" calls and then
iterate over that (a rough sketch of this follows the hardware list
below). But let's keep that for later.

> SOFTCORE CPU EVALUATION
> LLVM EVALUATION

Mor1kx and companions appear to be a very promising candidate. The
currently better ecosystem outweighs the size penalty, IMHO.

We discussed a potential need for an FPU in the SoC. Possible use cases
would be Bayesian analysis of photon count histograms and PID control
loops that we may want to run quickly and without the need to wrap our
heads around fixed-point implementations. Is the estimate of a slowdown
of a few hundred times for soft-fp compared to hard-fp reasonable? If
soft-fp turns out to be too slow, it looks like the floating point
instruction set of OpenRISC, custom FPU acceleration gateware, one of
the closed source soft cores with an FPU, or a hard CPU (CPU-FPGA chip
combination or separate) would be candidates. Could you give us your
opinion on those options?

Then, let's start brainstorming hardware options. My list is currently:

* SPEC (open hardware, promising community and availability life cycle,
  many IOs, rich mezzanine ecosystem, useful cross-device timing
  ecosystem, cheap)
* Something based on µTCA stuff, either available FPGA cards or custom.
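Coming back to the RTIO lowering idea above, here is a minimal sketch of
what the extracted table and the loop over it could look like. Only the
queue_rtio_out(time, channel, action) call is taken from the proposal;
the table layout and the replay helper are purely illustrative:

    # Hypothetical flat event table that the compiler could extract from
    # a fully inlined kernel; times are in machine units.
    RTIO_EVENTS = (
        # (time, channel, action)
        (0,   0, 1),
        (100, 0, 0),
        (100, 1, 1),
        (250, 1, 0),
    )

    def replay_events(queue_rtio_out):
        # queue_rtio_out is whatever output primitive the core device
        # exposes; it is passed in here only to keep the sketch
        # self-contained.
        for time, channel, action in RTIO_EVENTS:
            queue_rtio_out(time, channel, action)

    # e.g. replay_events(print) just prints the four events in order.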
A few mostly unsorted items came up. Most of them are relevant at a
later stage but I'll write them down anyway:

* Need for an underflow flag to taint experiments. Set when the CPU does
  not keep up with the timing, i.e. when computation blocks event
  generation.

* In addition to the underflow flag, for longer computations or RPCs
  with unpredictable timing something like this would be needed:

      flag = False
      with parallel:
          with sequential:
              delay(timeout)
              flag = True
          with sequential:
              for partial_result in do_computation():
                  if flag:
                      break

  which might be easier implemented as:

      with parallel:
          with sequential:
              delay(timeout)
              loopback_out()
          with sequential:
              for partial_result in do_computation():
                  if loopback_in():
                      break

  where loopback_{out,in}() access a hard-timed loopback device.

* "a.pulse(f, t1); a.pulse(f, t2)" should be lowered to a single
  "a.pulse(f, t1 + t2)" at some transformation stage (a rough sketch of
  such a pass is in the PS below).

* The DDS device needs kernels like "a.phase(pi/2)" to change phase or
  frequency mid-pulse. The jump would be sandwiched between two pulses,
  which would either be joined (as above) or given in a parallel block:

      with parallel:
          a.pulse(..., t1 + t2)
          with sequential:
              delay(t1)
              a.phase(...)

Robert.
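PS: a minimal sketch of the pulse-merging pass mentioned in the items
above, operating on a made-up flat event representation; none of this is
actual compiler code, it only illustrates the merging rule:

    # Each entry: (channel, frequency, duration), durations in machine units.
    events = [("a", 200, 10), ("a", 200, 20), ("b", 50, 5)]

    def merge_adjacent_pulses(events):
        merged = []
        for channel, freq, duration in events:
            if merged and merged[-1][:2] == (channel, freq):
                # Same channel and same frequency as the previous pulse:
                # extend the previous pulse instead of emitting a new one.
                prev_duration = merged[-1][2]
                merged[-1] = (channel, freq, prev_duration + duration)
            else:
                merged.append((channel, freq, duration))
        return merged

    # merge_adjacent_pulses(events)
    # -> [("a", 200, 30), ("b", 50, 5)]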