Re: [Kwant] A question about the representation of the wavefunction

2019-05-17 Thread Adel Kara Slimane
I am suggesting a possible improvement (at least from my current point 
of view) to how 'operators' are written in Kwant. By 'operators' I mean 
the classes "Current", "Density" and "Source", written in the file 
"kwant/operator.pyx". This proposal will not affect end-users of 
Kwant, only Kwant developers and code contributors: it is of interest 
to my group because we are currently working on implementing additional 
operators for tkwant, similar to "Current", "Density" and "Source".


My suggestion is to be able to write the "Current" class as in the 
attached file "suggestion.pyx" instead of what's currently done (an 
extract is in "current-implementation.pyx"), while still being as fast.


The same files are available here: 
 (with 
code highlighting)


Adel.

On Fri, 17 May, 2019 at 5:07 PM, Christoph Groth 
 wrote:

Adel Kara Slimane wrote :


 Joseph Weston wrote:

 > I read through this thread and I'm still not really sure what is
 > being proposed.  In the first email in the thread it seems as if you
 > are proposing a wrapper for wavefunctions so that they can be
 > indexed by site. It seems that you want to be able to write:
 >
 >  psi(i).dagger() * syst.hamiltonian(i, j, params) * psi(j)
 >
 Yes, that is what I want. My end goal is to make writing (additional)
 kwant/tkwant operators easier by writing a formula closer to the
 mathematical expression: my group is working on implementing
 additional operators for tkwant for energy and heat transport.


I have the impression that we first need to talk about what you would
like to achieve and not get lost in the details.

(Specifically, it seems premature to talk about anything like the
Armadillo library.  This is one of many C++ libraries for linear
algebra.  You seem to be interested in users of Kwant, that is people
who write Python scripts and perhaps Cython scripts.  How would a C++
library help here?)

I have the impression that you'd like to be able to write code like the
above in Python (or Cython?).  You'd like the code to run fast and be
easy to read (and write).  Is this correct?  Can you please describe
your concrete problem, perhaps even with code examples?


cdef class Current(_LocalOperator):
    # ...
    def _operate(self, complex[:] out_data, complex[:] bra, complex[:] ket,
                 args, operation op, *, params=None):
        # prepare onsite matrices and hamiltonians
        cdef int unique_onsite = not callable(self.onsite)

        cdef gint[:, :] offsets, norbs = _get_all_orbs(self.where, self._site_ranges)

        # MathVector is an "accesser" class
        cdef MathVector wf = MathVector(ket, offsets, norbs)

        # main loop
        cdef gint a, b
        for w in range(self.where.shape[0]):
            if op == MAT_ELS:
                a = self.where[w, 0]
                b = self.where[w, 1]

                cdef MathMatrix H = MathMatrix(self.syst.hamiltonian(a, b, *args, params=params))

                if not unique_onsite:
                    cdef MathMatrix M = MathMatrix(self.onsite(a, *args, params=params))
                    out_data[w] = (wf[a].dagger() * M * H * wf[b]).imag()
                else:
                    out_data[w] = (wf[a].dagger() * H * wf[b]).imag()
cdef class Current(_LocalOperator):
    # ...
    def _operate(self, complex[:] out_data, complex[:] bra, complex[:] ket,
                 args, operation op, *, params=None):
        # prepare onsite matrices and hamiltonians
        cdef int unique_onsite = not callable(self.onsite)
        cdef complex[:, :] _tmp_mat
        cdef complex *M_a = NULL
        cdef complex *H_ab = NULL
        cdef BlockSparseMatrix M_a_blocks, H_ab_blocks

        if unique_onsite:
            _tmp_mat = self.onsite
            M_a = &_tmp_mat[0, 0]
        elif self._bound_onsite:
            M_a_blocks = self._bound_onsite
        else:
            M_a_blocks = self._eval_onsites(args, params)

        if self._bound_hamiltonian:
            H_ab_blocks = self._bound_hamiltonian
        else:
            H_ab_blocks = self._eval_hamiltonian(args, params)

        # main loop
        cdef gint a, a_s, a_norbs, b, b_s, b_norbs
        cdef gint i, j, k, w
        cdef complex tmp
        for w in range(self.where.shape[0]):
            ### get the next hopping's start orbitals and numbers of orbitals
            a_s = H_ab_blocks.block_offsets[w, 0]
            b_s = H_ab_blocks.block_offsets[w, 1]
            a_norbs = H_ab_blocks.block_shapes[w, 0]
            b_norbs = H_ab_blocks.block_shapes[w, 1]
            ### get the next onsite and Hamiltonian matrices
            H_ab = H_ab_blocks.get(w)
            if not unique_onsite:
                M_a = M_a_blocks.get(w)
            ### do the actual calculation
            if op == MAT_ELS:
                tmp = 0
                for i in 

Re: [Kwant] A question about the representation of the wavefunction

2019-05-17 Thread Adel Kara Slimane

Thank you for your clarifications.

Also, it seems that I obtained an interesting result while trying to come 
up with a minimal example to do benchmarks on for representing the 
wavefunction: summing over the values of a 
vector<vector<complex<double>>> takes a similar amount of time as 
summing over its flattened counterpart, if the sum over the flat array 
needs to be aware of the orbitals for each site, i.e. doing something 
like this:


   vector<complex<double>> wavefunction;
   vector<int> offsets;

   // ...
   // fill the wavefunction and offsets
   // ...

   complex<double> result = 0;
   for(int site = 0 ; site < offsets.size() - 1 ; site++)
       for(int orbital = 0 ; orbital < offsets[site+1] - offsets[site] ; orbital++)
           result += wavefunction[offsets[site] + orbital];

I attached the code I wrote that gave me this result. It may be because 
of under-the-hood optimisations of the compiler.


I will answer your question about the "accesser" in my answer to your 
second email.


Adel

On Fri, 17 May, 2019 at 3:53 PM, Christoph Groth 
 wrote:

Adel Kara Slimane wrote:


 About the memory usage, I don't understand why it would take
 significantly more space, but maybe because I first suggested an array
 of Python objects instead of compiled objects, which take more space?


A Python list of 1000 tinyarrays of two numbers each takes up much more
space than an array (numpy or tinyarray or array from the Python stdlib)
because it involves 1000 Python objects that each store three
pointer-sized data fields.  Moreover, these 1000 chunks of memory are
allocated separately and this requires storing additional data and may
lead to additional waste due to holes.

The situation for a vector<vector<complex<double>>> in C++ is not
fundamentally different.  Each vector instance stores its size and a
pointer to the allocated memory.  And the memory allocation also
requires memory for metadata and likely causes waste.

The slowness is due to the fact that more memory needs to be read, and
that this memory is not contiguous.  Because of the way L1 and L2 caches
work, computers are fastest when reading memory contiguously.

 I agree and I am interested in coding the solution you are suggesting.
 I would like to suggest a further development to it: instead of
 returning a simple list of 2D arrays without copying anything, we
 use an "accesser" object that would enable matrix products:

 1. It gets initialised with the system "syst".

 2. Calling "accesser.wavefunction(bra, site, orbital)" (or some similar
    writing) returns "wavefunction[orbital + offset(site)]" with the
    right offset.

 3. The matrix inner product is implemented:
    "accesser.wavefunction(bra, site) * accesser.hamiltonian(site) *
    accesser.wavefunction(ket, site)" would be a valid expression (or
    something similar, with the same idea to have an apparent
    matrix-vector or matrix-matrix product).


So you suggest introducing a separate object, the "accesser", that is
initialized with a system?

First of all, why a separate object?  After all, the system itself,
which already knows its 'site_ranges', could take over this work.

I will reply in another message for the rest.

Christoph


#include <iostream>
#include <vector>
#include <complex>
#include <random>

using namespace std;

typedef unsigned long ulong;

complex<double> sum(const vector<vector<complex<double>>> &wf)
{
    complex<double> result1 = 0;

    for(ulong site = 0 ; site < wf.size() ; site++)
    {
        for(ulong orb = 0 ; orb < wf[site].size() ; orb++)
        {
            result1 += wf[site][orb];
        }
    }

    return result1;
}

complex<double> sumFlat(complex<double> wf[], ulong size)
{
    complex<double> result2 = 0;

    for(ulong i = 0 ; i < size ; i++)
    {
        result2 += wf[i];
    }

    return result2;
}

complex<double> sumFlatOffsets(complex<double> wf[], const vector<ulong> &offsets)
{
    complex<double> result2 = 0;
    ulong orbitalNumber;

    for(ulong site = 0 ; site < offsets.size() - 1 ; site++)
    {
        orbitalNumber = offsets[site + 1] - offsets[site];
        for(ulong orbital = 0 ; orbital < orbitalNumber ; orbital++)
            result2 += wf[offsets[site] + orbital];
    }

    return result2;
}


void benchmark()
{
    ulong sitesNum = 1E7;

    default_random_engine generator;
    uniform_int_distribution<int> intDistribution(1, 6);

    vector<int> orbitalsPerSite;

    cout << "---" << endl;
    cout << "Number of sites: " << sitesNum << endl;
    cout << "Deciding how many orbitals each site has, random number between 1 and 6" << endl;

    for(ulong site = 0 ; site < sitesNum ; site++)
        orbitalsPerSite.push_back(intDistribution(generator));

    cout << "---" << endl;
    cout << "Creating wavefunction as a vector<vector<complex<double>>>: wf[site][orbital]" << endl;
    cout << "the size of wf[site] given with the previously generated random sizes" << endl;
    cout << "Filling wavefunction with random values" << endl;

    uniform_real_distribution<double> realDistribution(-1, 1);
    vector<vector<complex<double>>> wavefunction1;
    wavefunction1.resize(sitesNum);

    //imaginary number
    complex<double> i(0, 1);


Re: [Kwant] A question about the representation of the wavefunction

2019-05-17 Thread Christoph Groth
Adel Kara Slimane wrote:

> About the memory usage, I don't understand why it would take
> significantly more space, but maybe because I first suggested an array
> of Python objects instead of compiled objects, which take more space?

A Python list of 1000 tinyarrays of two numbers each takes up much more
space than an array (numpy or tinyarray or array from the Python stdlib)
because it involves 1000 Python objects that each store three
pointer-sized data fields.  Moreover, these 1000 chunks of memory are
allocated separately and this requires storing additional data and may
lead to additional waste due to holes.

The situation for a vector<vector<complex<double>>> in C++ is not
fundamentally different.  Each vector instance stores its size and a
pointer to the allocated memory.  And the memory allocation also
requires memory for metadata and likely causes waste.

The slowness is due to the fact that more memory needs to be read, and
that this memory is not contiguous.  Because of the way L1 and L2 caches
work, computers are fastest when reading memory contiguously.
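
(To make the size comparison concrete, here is a small numpy sketch, with
tiny numpy arrays standing in for tinyarrays; the exact byte counts are
platform-dependent, but the overhead of one object per site is clearly
visible:)

    import sys
    import numpy as np

    n = 1000

    # one contiguous block: 2 orbitals per site, complex128
    flat = np.zeros(2 * n, dtype=complex)

    # one small array object per site (a stand-in for a list of tinyarrays)
    chunked = [np.zeros(2, dtype=complex) for _ in range(n)]

    print(flat.nbytes)                     # 32000 bytes of actual data
    print(sys.getsizeof(chunked)           # the list of pointers ...
          + sum(sys.getsizeof(a)           # ... plus one object header per site
                for a in chunked))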

> I agree and I am interested in coding the solution you are suggesting.
> I would like to suggest a further development to it: instead of
> returning a simple list of 2D arrays without copying anything, we
> use an "accesser" object that would enable matrix products:
>
> 1. It gets initialised with the system "syst".
>
> 2. Calling "accesser.wavefunction(bra, site, orbital)" (or some similar
>    writing) returns "wavefunction[orbital + offset(site)]" with the
>    right offset.
>
> 3. The matrix inner product is implemented:
>    "accesser.wavefunction(bra, site) * accesser.hamiltonian(site) *
>    accesser.wavefunction(ket, site)" would be a valid expression (or
>    something similar, with the same idea to have an apparent
>    matrix-vector or matrix-matrix product).

So you suggest introducing a separate object, the "accesser", that is
initialized with a system?

First of all, why a separate object?  After all, the system itself,
which already knows its 'site_ranges', could take over this work.

I will reply in another message for the rest.

Christoph




Re: [Kwant] A question about the representation of the wavefunction

2019-05-16 Thread Adel Kara Slimane

Hello Joseph,

I apologise for not being clear enough; I will try to remedy this in 
this email. And I understand that it is very plausible that everything I 
may suggest has been thought of before, and maybe put aside for various 
good reasons.
I read through this thread and I'm still not really sure what is 
being proposed. In the first email in the thread it seems as if you 
are proposing a wrapper for wavefunctions so that they can be indexed 
by site. It seems that you want to be able to write:


   psi(i).dagger() * syst.hamiltonian(i, j, params) * psi(j)

Yes, that is what I want. My end goal is to make writing (additional) 
kwant/tkwant operators easier by writing a formula closer to the 
mathematical expression: my group is working on implementing 
additional operators for tkwant for energy and heat transport.


To implement what I am suggesting, three things seem to be required:

1. Have a wrapper to make the wavefunctions accessible by site, and 
   implement the "dagger()" method.

2. Have the hamiltonian accessible by site, which is the case (but 
   slow?) with "syst.hamiltonian()"; it seems that it is indirectly 
   used by the class "Current" for example, so using 
   "syst.hamiltonian()" directly shouldn't be slower.

3. Operator overloading: be able to make Cython understand what the 
   "*" operator is between "psi(i).dagger()" and 
   "syst.hamiltonian(i, j, params)" (see the small numpy sketch below).
I understood that 1. was settled and could be done: no kwant code is 
changed and only an additional wrapper is implemented, for use when it 
comes in handy, for example in writing kwant operators. Is that what 
was rejected and what you tried implementing? @Groth seemed to be okay 
with it in his first mail answer (on 10-05, last paragraph), or I may 
have misunderstood him...


The difficulty is in coming up with a concrete proposal on how to 
implement 3. while still being as fast as the current implementation of, 
for example, "_operate" from the "Current" class: nested for loops over 
contiguous arrays, everything being cythonised/compiled (the only 
slow/Python part seems to be evaluating the hamiltonian with "params").  
The first suggestion I came up with is to use the C++ library Armadillo, 
with its already implemented matrix-vector products along with its 
matrix and vector classes. I can see three good points about it:

1. Its vector and matrix classes can be instantiated with pointers to 
   already existing data, i.e. the existing wavefunction and hamiltonian 
   in kwant. So no memory copy.

2. I assume that writing "psi(i).dagger() * syst.hamiltonian(i, j, params) * psi(j)" 
   with Armadillo objects, in a Cython file, will get compiled. Thus I 
   expect the performance to be similar to the current approach with 
   nested for loops.

3. It may be possible to take advantage of the library's optimisations?
One, maybe huge, negative point is that it would add an additional 
dependency to Kwant for something that can be implemented by hand in a 
relatively short time (matrix-vector products). But maybe we can use 
its optimisations and not bother with reimplementing them: apparently, 
writing "trace(A * B * C * D)" in this library gets optimised so that 
it doesn't take the naive approach of computing the matrix product 
first and then taking the trace of the result, which makes writing 
"trace(A * B * C * D)" still as fast as its nested-for-loops 
implementation.


I hope I could make my thinking clearer.

Thank you,

Adel



On Thu, 16 May, 2019 at 12:49 PM, Joseph Weston 
 wrote:

Hi Adel,

I read through this thread and I'm still not really sure what is 
being proposed. In the first email in the thread it seems as if you 
are proposing a wrapper for wavefunctions so that they can be indexed 
by site. It seems that you want to be able to write:



   psi(i).dagger() * syst.hamiltonian(i, j, params) * psi(j)


This is a fine proposal. I could imagine that a simple implementation 
for such a wrapper would not need to use any exotic storage formats, 
C-level accessors or anything: finalized systems already contain an 
attribute 'site_ranges' that allows for constant-time access to the 
orbital offset of a site. I actually proposed such a wrapper some 
time ago, but my proposal was rejected [1] because it did not give 
vectorized access to the wavefunction elements. It is not obvious to 
me how to give vectorized access without copying.


Then after this proposal the thread seems to start discussing 
_LocalOperator (which is an internal detail that was never meant to 
be extended or user-facing) and lots of details about implementation 
strategies in C++ for *something*, but it is not clear to me what the 
end goal is. Perhaps you want to be able to more easily express 
arbitrary operators? Or are you proposing a refactoring of the kwant 
operators module to make it more readable? A clarification would be 
very useful for me.


 Thanks,
Joe

[1]: 



On 5/14/19 3:14 PM, Adel Kara Slimane 

Re: [Kwant] A question about the representation of the wavefunction

2019-05-16 Thread Joseph Weston
Hi Adel,

I read through this thread and I'm still not really sure what is being
proposed. In the first email in the thread it seems as if you are
proposing a wrapper for wavefunctions so that they can be indexed by
site. It seems that you want to be able to write:


   psi(i).dagger() * syst.hamiltonian(i, j, params) * psi(j)


This is a fine proposal. I could imagine that a simple implementation
for such a wrapper would not need to use any exotic storage formats,
C-level accessors or anything: finalized systems already contain an
attribute 'site_ranges' that allows for constant-time access to the
orbital offset of a site. I actually proposed such a wrapper some time
ago, but my proposal was rejected [1] because it did not give vectorized
access to the wavefunction elements. It is not obvious to me how to give
vectorized access without copying.
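
(A rough sketch of such a wrapper, assuming 'site_ranges' holds triples
(first_site, norbs, orb_offset) terminated by (num_sites, 0, tot_norbs);
this is not an existing Kwant API, only an illustration of the
constant-time offset lookup:)

    import numpy as np
    from bisect import bisect

    class WaveFunction:
        """Index a flat wave function by site, using site_ranges for offsets."""

        def __init__(self, site_ranges, psi):
            # site_ranges: triples (first_site, norbs, orb_offset),
            # terminated by (num_sites, 0, tot_norbs).
            self.site_ranges = list(site_ranges)
            self.first_sites = [r[0] for r in self.site_ranges]
            self.psi = np.asarray(psi)

        def __call__(self, site):
            i = bisect(self.first_sites, site) - 1   # which range contains 'site'
            first, norbs, offset = self.site_ranges[i]
            start = offset + (site - first) * norbs
            return self.psi[start:start + norbs]     # a view, not a copy

        def dagger(self, site):
            return self(site).conj()

    # Toy example: sites 0-1 have 1 orbital, sites 2-4 have 2 orbitals.
    psi = WaveFunction([(0, 1, 0), (2, 2, 2), (5, 0, 8)],
                       np.arange(8, dtype=complex))
    print(psi(3))    # orbitals of site 3: [4.+0.j 5.+0.j]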

Then after this proposal the thread seems to start discussing
_LocalOperator (which is an internal detail that was never meant to be
extended or user-facing) and lots of details about implementation
strategies in C++ for *something*, but it is not clear to me what the
end goal is. Perhaps you want to be able to more easily express
arbitrary operators? Or are you proposing a refactoring of the kwant
operators module to make it more readable? A clarification would be very
useful for me.

Thanks,

Joe

[1]: https://gitlab.kwant-project.org/kwant/kwant/merge_requests/47



On 5/14/19 3:14 PM, Adel Kara Slimane wrote:

> Hello Christoph,
>
> I have a few additions to make on top of my previous mail; I hope I am
> not making you waste your time with trivial or useless suggestions.
>
> I have found a C++ library, Armadillo ,
> that can do the trick for implementing a fast accesser, with built-in
> matrix-vector products.
>
> Let's assume we have the wavefunction and the hamiltonian on flat
> contiguous arrays. This library has Row, Column, Matrix and
> SparseMatrix object implementations, with inner functions like
> hermitian conjugate or transpose, and with operations between them
> like sum and multiplication. The good thing about it is that one can
> initialise such objects with pointers to existing memory, without
> copying; here are the constructors that one could use:
>
> For vectors:
>
> vec(ptr_aux_mem, number_of_elements, copy_aux_mem = true, strict = false)
>
> For matrices:
>
> mat(ptr_aux_mem, n_rows, n_cols, copy_aux_mem = true, strict = false)
>
> So one can wrap such objects in Cython and stack allocate them with
> the correct memory pointers and sizes, then use them to write the
> desired operator expression in a compact way. In terms of speed, I
> expect it to be nearly as fast as a "by-hand" implementation, since
> the expression will be cythonised/compiled, thus ending up with
> similar nested for-loops after compilation.
>
> If I understand well, matrices, for example the hamiltonian, don't
> really have a low-level representation, since "params" needs to be
> provided first to create a flat matrix of only complex values, and
> that would be the point of "_eval_hamiltonian()" in the
> "_LocalOperator" class: it returns a BlockSparseMatrix, which is
> basically a flat array containing all the matrices of all the needed
> hoppings given in the "where" list. I have a question about that:
>
> Would it be as fast to compute the hamiltonian little by little while
> calculating the operator's value on each hopping of the "where" list?
> To show what I mean exactly, I have a code example
>  (I
> couldn't test it yet unfortunately) using the "tinyarray" Python
> package. The idea would then be to use Armadillo objects instead of
> tinyarray, and have the code compilable. This approach would have the
> benefit of not using "_eval_hamiltonian()" nor the "BlockSparseMatrix"
> class, but will need the extra Armadillo wrapping code, and mapping
> the return of "syst.hamiltonian" to a Matrix object.
>
>
> Thank you,
>
> Adel KARA SLIMANE
>
> On Mon, 13 May, 2019 at 4:38 PM, Adel Kara Slimane
>  wrote:
>> Hello Christoph,
>>
>> Thank you for your detailed answer.
>>
>> I convinced myself with a small C++ benchmark code that summing over
>> the complex values of a vector<vector<complex<double>>> array is more
>> than twice as slow as summing over a flat, same size, dynamically
>> allocated array (I attached the code to this mail). About the memory
>> usage, I don't understand why it would take significantly more space,
>> but maybe because I first suggested an array of Python objects
>> instead of compiled objects, which take more space?
>>
>> I agree and I am interested by coding the solution you are
>> suggesting, I would like to suggest further developpement to it:
>> instead of returning a simple list of 2D arrays without copying
>> anything, we use an "accesser" object 

Re: [Kwant] A question about the representation of the wavefunction

2019-05-14 Thread Adel Kara Slimane

Hello Christoph,

I have a few additions to make on top of my previous mail; I hope I am 
not making you waste your time with trivial or useless suggestions.


I have found a C++ library, Armadillo, that can do the trick for 
implementing a fast accesser, with built-in matrix-vector products.


Let's assume we have the wavefunction and the hamiltonian in flat 
contiguous arrays. This library has Row, Column, Matrix and 
SparseMatrix object implementations, with inner functions like 
hermitian conjugate or transpose, and with operations between them like 
sum and multiplication. The good thing about it is that one can 
initialise such objects with pointers to existing memory, without 
copying; here are the constructors that one could use:


For vectors:
vec(ptr_aux_mem, number_of_elements, copy_aux_mem = true, strict = false)

For matrices:
mat(ptr_aux_mem, n_rows, n_cols, copy_aux_mem = true, strict = false)
So one can wrap such objects in Cython and stack allocate them with the 
correct memory pointers and sizes, then use them to write the desired 
operator expression in a compact way. In terms of speed, I expect it to 
be nearly as fast as a "by-hand" implementation, since the expression 
will be cythonised/compiled, thus ending up with similar nested 
for-loops after compilation.


If I understand well, matrices, for example the hamiltonian, don't 
really have a low-level representation, since "params" needs to be 
provided first to create a flat matrix of only complex values, and that 
would be the point of "_eval_hamiltonian()" in the "_LocalOperator" 
class: it returns a BlockSparseMatrix, which is basically a flat array 
containing all the matrices of all the needed hoppings given in the 
"where" list. I have a question about that:


Would it be as fast to compute the hamiltonian little by little while 
calculating the operator's value on each hopping of the "where" list?  
To show what I mean exactly, I have a code example 
 (I 
couldn't test it yet unfortunately) using the "tinyarray" Python 
package. The idea would then be to use Armadillo objects instead of 
tinyarray, and have the code compilable. This approach would have the 
benefit of not using "_eval_hamiltonian()" nor the "BlockSparseMatrix" 
class, but will need the extra Armadillo wrapping code, and mapping the 
return of "syst.hamiltonian" to a Matrix object.



Thank you,

Adel KARA SLIMANE

On Mon, 13 May, 2019 at 4:38 PM, Adel Kara Slimane 
 wrote:

Hello Christoph,

Thank you for your detailed answer.

I convinced myself with a small C++ benchmark code that summing over 
the complex values of a vector<vector<complex<double>>> array is more 
than twice as slow as summing over a flat, same size, dynamically 
allocated array (I attached the code to this mail). About the memory 
usage, I don't understand why it would take significantly more space, 
but maybe because I first suggested an array of Python objects 
instead of compiled objects, which take more space?


I agree and I am interested in coding the solution you are 
suggesting. I would like to suggest a further development to it: 
instead of returning a simple list of 2D arrays without copying 
anything, we use an "accesser" object that would enable matrix 
products:

1. It gets initialised with the system "syst".

2. Calling "accesser.wavefunction(bra, site, orbital)" (or some similar 
   writing) returns "wavefunction[orbital + offset(site)]" with the 
   right offset.

3. The matrix inner product is implemented: 
   "accesser.wavefunction(bra, site) * accesser.hamiltonian(site) * 
   accesser.wavefunction(ket, site)" would be a valid expression (or 
   something similar, with the same idea to have an apparent 
   matrix-vector or matrix-matrix product).

The third bullet point raises some questions for me though:

1. What do you think of it?

2. I need to understand how the hamiltonian is represented to be able 
   to think of ways to implement it, but I can't grasp what the 
   low-level representation of the hamiltonian is once the system is 
   finalised. Is it a flat array too? When I look at 
   "_eval_hamiltonian()" in the class "_LocalOperator", I see that it 
   uses the return value of "syst.hamiltonian()", converts it into a 
   Python tinyarray.matrix object, then converts it once again into a 
   BlockSparseMatrix.


Thank you,

Adel



On Fri, 10 May, 2019 at 6:15 PM, Christoph Groth 
 wrote:

Hi Adel,

Your understanding of the way wave functions (and other observables) are
stored is correct.  You are also right that working with low-level
observables is inconvenient and could be improved.  I'll explain how the
current situation came about, and offer a possible solution.  As always,
we're glad to hear comments or suggestions.

I still think that our original design choice to store everything in one
flat array was a good one.  The 

Re: [Kwant] A question about the representation of the wavefunction

2019-05-13 Thread Adel Kara Slimane

Hello Christoph,

Thank you for your detailed answer.

I convinced myself with a small C++ benchmark code that summing over 
the complex values of a vector<vector<complex<double>>> array is more 
than twice as slow as summing over a flat, same size, dynamically 
allocated array (I attached the code to this mail). About the memory 
usage, I don't understand why it would take significantly more space, 
but maybe because I first suggested an array of Python objects instead 
of compiled objects, which take more space?


I agree and I am interested in coding the solution you are suggesting. 
I would like to suggest a further development to it: instead of 
returning a simple list of 2D arrays without copying anything, we use 
an "accesser" object that would enable matrix products:

1. It gets initialised with the system "syst".

2. Calling "accesser.wavefunction(bra, site, orbital)" (or some similar 
   writing) returns "wavefunction[orbital + offset(site)]" with the 
   right offset.

3. The matrix inner product is implemented: 
   "accesser.wavefunction(bra, site) * accesser.hamiltonian(site) * 
   accesser.wavefunction(ket, site)" would be a valid expression (or 
   something similar, with the same idea to have an apparent 
   matrix-vector or matrix-matrix product).

The third bullet point raises some questions for me though:

1. What do you think of it?

2. I need to understand how the hamiltonian is represented to be able 
   to think of ways to implement it, but I can't grasp what the 
   low-level representation of the hamiltonian is once the system is 
   finalised. Is it a flat array too? When I look at 
   "_eval_hamiltonian()" in the class "_LocalOperator", I see that it 
   uses the return value of "syst.hamiltonian()", converts it into a 
   Python tinyarray.matrix object, then converts it once again into a 
   BlockSparseMatrix.


Thank you,

Adel



On Fri, 10 May, 2019 at 6:15 PM, Christoph Groth 
 wrote:

Hi Adel,

Your understanding of the way wave functions (and other observables) are
stored is correct.  You are also right that working with low-level
observables is inconvenient and could be improved.  I'll explain how the
current situation came about, and offer a possible solution.  As always,
we're glad to hear comments or suggestions.

I still think that our original design choice to store everything in one
flat array was a good one.  The alternatives you mention would require
much more RAM and would be also significantly slower due to decreased
memory locality.

If we were to store a list of tinyarrays that would require storing one
Python object (the tinyarray) per site.  For example, if we were to do
this for a system with two orbitals per site, that would mean a memory
cost of 8 + 56 = 64 bytes per site.  With the current approach, we
require two complex numbers, that is 16 bytes per site.  What's more,
accessing this memory is no longer sequential, which makes it even
slower.

(You could object that finalized builders already store O(N) Python
objects, namely the sites and their tags.  But (1) that is not true for
general Kwant low-level systems, and (2) we are working on solving this
problem for finalized builders as well.)

The other solutions that you mention have similar problems.  What we
could have done is sort the sites according to their number of orbitals,
and then for each block of sites store a 2d array of shape (num_sites,
num_orbs).  But back when the low-level format was established, the
number of orbitals of a site was not fixed, and so this could not have
been done.  The reason why the number was not fixed was not so much a
false sense of generality, but rather that we didn't see a good way to
specify the number of orbitals per site.  We only later had the idea to
fix the number of orbitals per site family.

(The above solution would have the inconvenience that the wave function
would be no longer a simple array, but rather a list of 2d arrays.  This
exposes more structure, but it also makes things like normalizing a wave
function more difficult.)

Here is how you can efficiently realize the above solution with a couple
of lines of code:

Check out the attribute 'site_ranges' of Kwant systems [1].  With this
information, you can write a function 'unpack' that takes a system and a
numpy array with one entry per orbital (for example a wave function),
and returns a list of 2d arrays, that, without copying, refer to the
data in a way that simplifies working with sites.

If this sounds like a good idea and you write this function, be sure to
share it.  We might want to include it into Kwant as a method of
'System'.

Cheers
Christoph

[1] 



#include <iostream>
#include <vector>
#include <complex>
#include <random>
#include <chrono>

using namespace std;

int main()
{
unsigned long sitesNum = 1E7, dice_roll = 1, size = 0, offset = 0;
unsigned long repetitions = 100;

complex<double> i(0, 1), result1 = 0, result2 = 0;

default_random_engine generator;
  

Re: [Kwant] A question about the representation of the wavefunction

2019-05-10 Thread Christoph Groth
Hi Adel,

Your understanding of the way wave functions (and other observables) are
stored is correct.  You are also right that working with low-level
observables is inconvenient and could be improved.  I'll explain how the
current situation came about, and offer a possible solution.  As always,
we're glad to hear comments or suggestions.

I still think that our original design choice to store everything in one
flat array was a good one.  The alternatives you mention would require
much more RAM and would be also significantly slower due to decreased
memory locality.

If we were to store a list of tinyarrays that would require storing one
Python object (the tinyarray) per site.  For example, if we were to do
this for a system with two orbitals per site, that would mean a memory
cost of 8 + 56 = 64 bytes per site.  With the current approach, we
require two complex numbers, that is 16 bytes per site.  What's more,
accessing this memory is no longer sequential, which makes it even
slower.

(You could object that finalized builders already store O(N) Python
objects, namely the sites and their tags.  But (1) that is not true for
general Kwant low-level systems, and (2) we are working on solving this
problem for finalized builders as well.)

The other solutions that you mention have similar problems.  What we
could have done is sort the sites according to their number of orbitals,
and then for each block of sites store a 2d array of shape (num_sites,
num_orbs).  But back when the low-level format was established, the
number of orbitals of a site was not fixed, and so this could not have
been done.  The reason why the number was not fixed was not so much a
false sense of generality, but rather that we didn't see a good way to
specify the number of orbitals per site.  We only later had the idea to
fix the number of orbitals per site family.

(The above solution would have the inconvenience that the wave function
would be no longer a simple array, but rather a list of 2d arrays.  This
exposes more structure, but it also makes things like normalizing a wave
function more difficult.)

Here is how you can efficiently realize the above solution with a couple
of lines of code:

Check out the attribute 'site_ranges' of Kwant systems [1].  With this
information, you can write a function 'unpack' that takes a system and a
numpy array with one entry per orbital (for example a wave function),
and returns a list of 2d arrays, that, without copying, refer to the
data in a way that simplifies working with sites.

If this sounds like a good idea and you write this function, be sure to
share it.  We might want to include it into Kwant as a method of
'System'.
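
(A minimal sketch of such an 'unpack' function, assuming 'site_ranges'
holds triples (first_site, norbs, orb_offset) terminated by
(num_sites, 0, tot_norbs) as described in [1]; the returned 2d arrays
are numpy views into the original data, not copies:)

    import numpy as np

    def unpack(site_ranges, psi):
        """Return a list of (num_sites, norbs)-shaped views into flat psi."""
        views = []
        for (first, norbs, offset), (next_first, _, next_offset) in zip(
                site_ranges, site_ranges[1:]):
            if norbs == 0:
                continue
            block = psi[offset:next_offset]          # a view, no copy
            views.append(block.reshape(next_first - first, norbs))
        return views

    # Toy example: 2 sites with 1 orbital, then 3 sites with 2 orbitals.
    site_ranges = [(0, 1, 0), (2, 2, 2), (5, 0, 8)]
    psi = np.arange(8, dtype=complex)
    print([block.shape for block in unpack(site_ranges, psi)])   # [(2, 1), (3, 2)]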

Cheers
Christoph

[1] 
https://kwant-project.org/doc/1/reference/generated/kwant.system.System#kwant.system.System




[Kwant] A question about the representation of the wavefunction

2019-05-10 Thread Adel Kara Slimane

Hello,


I recently started to look at Kwant from under the hood, and I have a 
question for you.


My current understanding is that the wavefunction is defined as a 1D 
array of complex values, the orbitals for each site being a contiguous 
chunk in the array, that can be accessed through book-keeping.


What I was expecting was a simple compilable representation that uses 
for example an array of pointers to tinyarrays:


`ta.array[:] wavefunction' or maybe `std::vector wavefunction`

Accessing the number of orbitals in a given site would then be a simple 
call to a `size()` member function:


`wavefunction[i].size()'.

The benefit I see in this approach is shorter and more readable 
code, from the developer's point of view, because we could use 
tinyarray's functions for expectation values of operators, for 
example for the hamiltonian:


`ta.dot(ta.conjugate(ta.transpose(wavefunction[i])), 
ta.dot(syst.hamiltonian(i, j, params), wavefunction[j]))`


If the * operator is defined between tinyarrays, and with a member 
function `dagger`, we could even write:


`wavefunction[i].dagger() * syst.hamiltonian(i,j, params) * 
wavefunction[j]`


And if one really needs to access orbitals, only an additional bracket 
is needed:


`wavefunction[site][orb]`

Using C++ vectors or Cython arrays of `ta.array`s to represent wave 
functions would be as fast as the current representation, or am I 
wrong? What were your reasons against such an approach?



Have a nice day!

Adel