Re: [sage-support] Re: Sampling elements from a vector space

2014-05-12 Thread Gerli Viikmaa

Thank you - this has sped up the dataset creation 2 times.

This is the resulting function:

def generate_data(field, length, n):
data = zero_matrix(field, n, length)
for i in range(n):
for j in range(length):
data[i,j] = field.random_element()
return data

Here the vectors are rows, so I can call sequences[i] to get the ith 
vector.


%time sequences = generate_data(GF(4,'a'), 36, 10^6)
CPU times: user 26.28 s, sys: 0.56 s, total: 26.84 s
Wall time: 26.93 s

compared to

space = VectorSpace(GF(4,'a'), 36)
%time sequences=[space.random_element() for __ in range(10^6)]
CPU times: user 58.41 s, sys: 0.20 s, total: 58.60 s
Wall time: 58.75 s

I do wonder if the filling up can be done even better...

Gerli

On 11/05/14 23:34, Dima Pasechnik wrote:

On 2014-05-11, Gerli Viikmaa gerliviik...@gmail.com wrote:

Hi,

Thank you for your reply.

This doesn't improve the time at all (which is my main concern). I am
still looking for a different way of generating this data, something
that would be at least 5-10 times faster than my proposed way.

the problem is apparently the slow creation of elements of the space,
not the process of generating random elements.
Here are results of profiling:

sage: %prun sequences=[space.random_element() for __ in range(5)]
  6050003 function calls in 393.762 seconds

 Ordered by: internal time

  ncalls tottime  percall  cumtime  percall filename:lineno(function)
  5  386.048  0.008  386.1850.008 
free_module.py:1903(zero_vector)
  5  4.5330.000  393.5740.008 
free_module.py:4609(random_element)
  1801.0920.0001.7550.000 
finite_field_givaro.py:208(random_element)
  1800.6630.0000.6630.000 {method 'random_element' of 
'sage.rings.finite_rings.element_givaro.Cache_gi}
  1800.3910.0000.3910.000 {method 'random' of 
'_random.Random' objects}
  5  0.3800.000  386.7420.008 free_module.py:5012(__call__)
.. some entries, not taking much time at all, deleted...

If you look at the random_element() code in sage/modules/free_module.py
(starting at the line 4609, see e.g.
https://github.com/sagemath/sage/blob/master/src/sage/modules/free_module.py)

then you see that it calls self(0), i.e. zero_vector(),
and this is what eats up almost all the time.

It will be much faster to allocate a zero matrix
(zero_matrix(GF(4, 'a'), 36,10^6))
and fill it in, basically using the random_element() code,
adapted appropriately.

HTH,
Dima


Gerli

On 11/05/14 22:23, Dima Pasechnik wrote:

On 2014-05-11, Gerli Viikmaa gerliviik...@gmail.com wrote:

Hi,

I am trying to analyse sets of random vectors from a vector space GF(4)^36.
For that I need to sample rather large sets (up to 10^6 vectors) many times
(I would like it to be 1000, depending on how fast I can analyse).

I first thought my analysis was slow on such large sets but apparently just
generating the sets takes an incredibly long time!

What is the preferred method for efficiently generating sets of random
vectors?

Setup:
space = VectorSpace(GF(4, 'a'), 36)
n = 10^6


I have tried the following methods:

sample(space, n)
gives me
OverflowError: Python int too large to convert to C long

An attempt to sample indexes and then ask for space[i] instead:
sample(range(4^36), n)
also results in
OverflowError: range() result has too many items

Trying to use space.random_element():
First I tried to get unique samples:
sequences = []
while len(sequences)  n:
  elem = space.random_element()
  if elem not in sequences:
  sequences.append(elem)
but this takes forever (and is impossible to interrupt). I let it run for
several minutes and realized this was not going to work. I don't know how
long it would actually take. I cannot use a set since the vectors are
mutable (although, I must admit I haven't tried turning them immutable and
seeing if using a set works better).

yes, you should make them immutable.
See below how.


Then I decided not to care about the uniqueness:
%time sequences=[space.random_element() for __ in range(n)]
Best so far (in that it actually gives me a result). This takes about *60
seconds* on my computer (based on a couple runs). Using xrange didn't
affect the time.

Is it possible to improve this time? And I would prefer it if the set
didn't contain any duplicates.

just do the following:

sequences=[space.random_element() for __ in range(n)]
for i in sequences:
 i.set_immutable()
seq_noreps=set(sequences) # now there are no repetitions

HTH,
Dima


If generating the dataset takes a whole
minute, it would take me over two weeks just to generate 1000 datasets of
this size...

Thanks in advance,
Gerli



--
You received this message because you are subscribed to the Google Groups 
sage-support group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sage-support+unsubscr...@googlegroups.com

[sage-support] Immutable HammingCode?

2014-05-12 Thread Gerli Viikmaa
Hi,

Consider the following code (the field is arbitrary in this case)
sage: code = codes.HammingCode(2, GF(4,'a'))

The codewords (vectors) cannot be changed (note the absence of an error):
sage: code[0]
(0, 0, 0, 0, 0)
sage: code[0][0] = 1
sage: code[0]
(0, 0, 0, 0, 0)

yet they are not immutable, even when trying to set them so:
sage: code[0].is_immutable()
False
sage: code[0].set_immutable()
sage: code[0].is_immutable()
False

Is there any particular reason why the codewords are not immutable if they 
cannot be changed anyway?

If they were, using them as dictionary keys would be less of a hassle:
sage: {code[0]: element}
TypeError: mutable vectors are unhashable

I guess the current options are
sage: v = code[0]
sage: v.set_immutable()
sage: {v: element}
or 
sage: {tuple(code[0]): element}
neither of which is particularly elegant.

Is there a way to set the all of the codewords as immutable?

I'm using Sage version 6.1.1 on a Linux machine.

Gerli

-- 
You received this message because you are subscribed to the Google Groups 
sage-support group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sage-support+unsubscr...@googlegroups.com.
To post to this group, send email to sage-support@googlegroups.com.
Visit this group at http://groups.google.com/group/sage-support.
For more options, visit https://groups.google.com/d/optout.


[sage-support] Sampling elements from a vector space

2014-05-11 Thread Gerli Viikmaa
Hi,

I am trying to analyse sets of random vectors from a vector space GF(4)^36. 
For that I need to sample rather large sets (up to 10^6 vectors) many times 
(I would like it to be 1000, depending on how fast I can analyse).

I first thought my analysis was slow on such large sets but apparently just 
generating the sets takes an incredibly long time!

What is the preferred method for efficiently generating sets of random 
vectors? 

Setup:
space = VectorSpace(GF(4, 'a'), 36)
n = 10^6 


I have tried the following methods:

sample(space, n)
gives me
OverflowError: Python int too large to convert to C long

An attempt to sample indexes and then ask for space[i] instead:
sample(range(4^36), n)
also results in 
OverflowError: range() result has too many items

Trying to use space.random_element():
First I tried to get unique samples:
sequences = []
while len(sequences)  n:
elem = space.random_element()
if elem not in sequences:
sequences.append(elem)
but this takes forever (and is impossible to interrupt). I let it run for 
several minutes and realized this was not going to work. I don't know how 
long it would actually take. I cannot use a set since the vectors are 
mutable (although, I must admit I haven't tried turning them immutable and 
seeing if using a set works better).

Then I decided not to care about the uniqueness:
%time sequences=[space.random_element() for __ in range(n)]
Best so far (in that it actually gives me a result). This takes about *60 
seconds* on my computer (based on a couple runs). Using xrange didn't 
affect the time.

Is it possible to improve this time? And I would prefer it if the set 
didn't contain any duplicates. If generating the dataset takes a whole 
minute, it would take me over two weeks just to generate 1000 datasets of 
this size...

Thanks in advance,
Gerli

-- 
You received this message because you are subscribed to the Google Groups 
sage-support group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sage-support+unsubscr...@googlegroups.com.
To post to this group, send email to sage-support@googlegroups.com.
Visit this group at http://groups.google.com/group/sage-support.
For more options, visit https://groups.google.com/d/optout.


Re: [sage-support] Re: Sampling elements from a vector space

2014-05-11 Thread Gerli Viikmaa

Hi,

Thank you for your reply.

This doesn't improve the time at all (which is my main concern). I am 
still looking for a different way of generating this data, something 
that would be at least 5-10 times faster than my proposed way.


Gerli

On 11/05/14 22:23, Dima Pasechnik wrote:

On 2014-05-11, Gerli Viikmaa gerliviik...@gmail.com wrote:

Hi,

I am trying to analyse sets of random vectors from a vector space GF(4)^36.
For that I need to sample rather large sets (up to 10^6 vectors) many times
(I would like it to be 1000, depending on how fast I can analyse).

I first thought my analysis was slow on such large sets but apparently just
generating the sets takes an incredibly long time!

What is the preferred method for efficiently generating sets of random
vectors?

Setup:
space = VectorSpace(GF(4, 'a'), 36)
n = 10^6


I have tried the following methods:

sample(space, n)
gives me
OverflowError: Python int too large to convert to C long

An attempt to sample indexes and then ask for space[i] instead:
sample(range(4^36), n)
also results in
OverflowError: range() result has too many items

Trying to use space.random_element():
First I tried to get unique samples:
sequences = []
while len(sequences)  n:
 elem = space.random_element()
 if elem not in sequences:
 sequences.append(elem)
but this takes forever (and is impossible to interrupt). I let it run for
several minutes and realized this was not going to work. I don't know how
long it would actually take. I cannot use a set since the vectors are
mutable (although, I must admit I haven't tried turning them immutable and
seeing if using a set works better).

yes, you should make them immutable.
See below how.


Then I decided not to care about the uniqueness:
%time sequences=[space.random_element() for __ in range(n)]
Best so far (in that it actually gives me a result). This takes about *60
seconds* on my computer (based on a couple runs). Using xrange didn't
affect the time.

Is it possible to improve this time? And I would prefer it if the set
didn't contain any duplicates.

just do the following:

sequences=[space.random_element() for __ in range(n)]
for i in sequences:
i.set_immutable()
seq_noreps=set(sequences) # now there are no repetitions

HTH,
Dima


If generating the dataset takes a whole
minute, it would take me over two weeks just to generate 1000 datasets of
this size...

Thanks in advance,
Gerli



--
You received this message because you are subscribed to the Google Groups 
sage-support group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sage-support+unsubscr...@googlegroups.com.
To post to this group, send email to sage-support@googlegroups.com.
Visit this group at http://groups.google.com/group/sage-support.
For more options, visit https://groups.google.com/d/optout.


[sage-support] Correctness of Extended Quadratic Residue Code over GF(4)

2014-03-24 Thread Gerli Viikmaa
Hi,

I'm working on vectors of varying sizes on GF(4) and I'm currently trying to 
implement the code given in 
http://www.iks.kit.edu/home/grassl/codetables/BKLC/BKLC.php?q=4n=8k=2

The first step (Extend the QRCode over GF(4) of length 11) should give me a 
[12, 6, 6] linear code - vectors of length 12, dimension 6, minimum (Hamming) 
distance 6.

But the execution
sage: codes.ExtendedQuadraticResidueCode(11, GF(4,'a'))
Linear code of length 12, dimension 11 over Finite Field in a of size 2^2

gives me a linear code of dimension 11. If my field size is prime, then I do 
get a code with dimension 6. According to that source, the dimension should be 
6 with field size 4 as well. Why isn't it?

I'm running Sage Version 6.1.1, Release Date: 2014-02-04 on Ubuntu 12.04.

Thank you in advance,
Gerli

-- 
You received this message because you are subscribed to the Google Groups 
sage-support group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sage-support+unsubscr...@googlegroups.com.
To post to this group, send email to sage-support@googlegroups.com.
Visit this group at http://groups.google.com/group/sage-support.
For more options, visit https://groups.google.com/d/optout.


Re: [sage-support] Correctness of Extended Quadratic Residue Code over GF(4)

2014-03-24 Thread Gerli Viikmaa
Hi,

Thank you for your reply. I will look into the implementation of QR codes over 
non-prime fields. 

All the best,
Gerli

esmaspäev, 24. märts 2014 12:54.29 UTC+2 kirjutas David Joyner:
 On Mon, Mar 24, 2014 at 6:28 AM, Gerli Viikmaa wrote:
 
  Hi,
 
 
 
  I'm working on vectors of varying sizes on GF(4) and I'm currently trying 
  to implement the code given in 
  http://www.iks.kit.edu/home/grassl/codetables/BKLC/BKLC.php?q=4n=8k=2
 
 
 
  The first step (Extend the QRCode over GF(4) of length 11) should give me a 
  [12, 6, 6] linear code - vectors of length 12, dimension 6, minimum 
  (Hamming) distance 6.
 
 
 
  But the execution
 
  sage: codes.ExtendedQuadraticResidueCode(11, GF(4,'a'))
 
  Linear code of length 12, dimension 11 over Finite Field in a of size 2^2
 
 
 
  gives me a linear code of dimension 11. If my field size is prime, then I 
  do get a code with dimension 6. According to that source, the dimension 
  should be 6 with field size 4 as well. Why isn't it?
 
 
 
 
 
 Quadratic residue codes are defined over prime fields:
 
 http://en.wikipedia.org/wiki/Quadratic_residue_code
 
 If there is a generalization of QR codes to the extension field case,
 
 it is not implemented in Sage. Sorry.
 
 Feel free to implement it and submit a patch:-)  (Alternatively, share
 
 your code on sage-devel.)
 
 
 
  I'm running Sage Version 6.1.1, Release Date: 2014-02-04 on Ubuntu 12.04.
 
 
 
  Thank you in advance,
 
  Gerli
 
 
 
  --
 
  You received this message because you are subscribed to the Google Groups 
  sage-support group.
 
  To unsubscribe from this group and stop receiving emails from it, send an 
  email to sage-support+unsubscr...@googlegroups.com.
 
  To post to this group, send email to sage-support@googlegroups.com.
 
  Visit this group at http://groups.google.com/group/sage-support.
 
  For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
sage-support group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sage-support+unsubscr...@googlegroups.com.
To post to this group, send email to sage-support@googlegroups.com.
Visit this group at http://groups.google.com/group/sage-support.
For more options, visit https://groups.google.com/d/optout.