On 1/06/2010 10:51 PM, Wes McKinney wrote:
>
> This is a pretty good example of the "group-by" problem that will
> hopefully work its way into a future edition of NumPy.
Wes (or anyone else), please can you elaborate on any plans for groupby?
I've made my own modification to numpy.bincount f
the output needs to be;
and (ii) allows you to control the size of the output array, as you may
want it bigger than the number of bins would suggest.
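For reference, later NumPy releases grew a minlength keyword on bincount that covers point (ii); a minimal sketch of that behaviour (toy data):

```python
import numpy as np

x = np.array([0, 1, 1, 3])
w = np.array([0.5, 1.0, 2.0, 4.0])

# minlength forces the output to at least 6 bins,
# even though max(x)+1 would only give 4.
counts = np.bincount(x, minlength=6)
sums = np.bincount(x, weights=w, minlength=6)  # per-bin weighted sums
```

Here counts is [1, 2, 0, 1, 0, 0] and sums is [0.5, 3.0, 0.0, 4.0, 0.0, 0.0].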
I look forward to the draft NEP!
Best regards
Stephen Simmons
On 13/04/2010 10:34 PM, Robert Kern wrote:
> On Sat, Apr 10, 2010 at 17
cs.mit.edu/projects/cstore/vldb.pdf).
Stephen
Francesc Alted wrote:
> On Friday 30 October 2009 14:18:05, Stephen Simmons wrote:
>
>> - Pytables (HDF using chunked storage for recarrays with LZO
>> compression and shuffle filter)
>> - can't extract individual field
Hi,
Is anyone working on alternative storage options for numpy arrays, and
specifically recarrays? My main application involves processing series
of large recarrays (say 1000 recarrays, each with 5M rows having 50
fields). Existing options meet some but not all of my requirements.
Requirement
David Warde-Farley wrote:
> On 23-May-09, at 4:25 PM, Albert Thuswaldner wrote:
>> Actually my vision with pyhdf5io is to have hdf5 to replace numpy's
>> own binary file format (.npy, npz). Pyhdf5io (or an incarnation of it)
>> should be the standard (binary) way to store data in scipy/numpy. A
>>
Wei Su wrote:
Hi, Francesc:
Thanks a lot for offering me help. My code is really simple as of now.
from pyodbc import *
from rpy import *
cnxn = connect('DRIVER={SQL Server};SERVER=srdata01\\sql2k5;DATAB
Hi,
Please can someone suggest resources for learning how to use the
'repeat' macros in numpy C code to avoid repeating sections of
type-specific code for each data type? Ideally there would be two types
of resources: (i) a description of how the repeat macros are meant to be
used/compiled; an
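For reference, the repeat macros live in NumPy's .c.src template files, which a template processor expands once per listed type before compilation. An illustrative fragment (not from the NumPy source; the function name is hypothetical), where @type@ is substituted with each value of #type#:

```c
/**begin repeat
 * #type = npy_float, npy_double#
 * #TYPE = FLOAT, DOUBLE#
 */
/* Expanded once per type: sum_npy_float, sum_npy_double, ... */
static @type@
sum_@type@(@type@ *arr, npy_intp n)
{
    @type@ total = 0;
    npy_intp i;
    for (i = 0; i < n; i++) {
        total += arr[i];
    }
    return total;
}
/**end repeat**/
```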
Hi,
Can anyone help me out with a simple way to vectorize this loop?
# idx and vals are arrays with indexes and values used to update array data
# data = numpy.ndarray(shape=(100,100,100,100), dtype='f4')
flattened = data.ravel()
for i in range(len(vals)):
    flattened[idx[i]] += vals[i]
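One vectorized equivalent (a sketch on a toy-sized array) uses np.add.at, added in NumPy 1.8, which unlike flattened[idx] += vals accumulates correctly when indices repeat:

```python
import numpy as np

data = np.zeros((4, 4), dtype='f4')   # small stand-in for the 100^4 array
idx = np.array([0, 5, 5, 15])         # flat indices, with one repeated
vals = np.array([1.0, 2.0, 3.0, 4.0], dtype='f4')

flat = data.ravel()                   # a view sharing data's memory
# np.add.at applies unbuffered in-place addition, so the two updates
# to index 5 both land (fancy-index += would apply only one of them).
np.add.at(flat, idx, vals)
```

Afterwards flat[5] is 5.0, and data[1, 1] shows the same value through the shared view.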
Many thanks
Hi Andrew,
Do you have any plans to support LZO compression in h5py?
I have lots of LZO-compressed datasets created with PyTables.
There's a real barrier to using both h5py and PyTables if the fast
decompressor options are just LZF on h5py and LZO on PyTables.
Many thanks
Stephen
Andrew Collette wrote:
Do you have any plans to add lzo compression support, in addition to
gzip? This is a feature I used a lot in PyTables.
Andrew Collette wrote:
> =====================================
> Announcing HDF5 for Python (h5py) 1.0
> =====================================
>
> What is h5py?
> -------------
>
Hi,
Has anyone written a parser for SQL-like queries against PyTables HDF
tables or numpy recarrays?
I'm asking because I have written code for grouping then summing rows of
source data, where the groups are defined by functions of the source
data, or looking up a related field in a separate l
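Parser aside, the grouping-then-summing step itself can be sketched in plain NumPy (toy data, names hypothetical): np.unique maps arbitrary group keys to dense ids, and bincount sums per group.

```python
import numpy as np

keys = np.array([2, 0, 2, 1, 0])                # group key per row
amounts = np.array([10.0, 1.0, 5.0, 7.0, 2.0])  # value to aggregate

# return_inverse gives each row's position in the sorted unique keys,
# i.e. a dense 0..k-1 group id suitable for bincount.
uniq, inv = np.unique(keys, return_inverse=True)
sums = np.bincount(inv, weights=amounts)        # per-group totals
```

This yields uniq = [0, 1, 2] with sums = [3.0, 7.0, 15.0].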
ting strings to integers
*
* Author: Stephen Simmons, [EMAIL PROTECTED]
* Date: 11 March 2007
*
* This module contains C code for functions I am using to accelerate
* SQL-like aggregate functions for a column-oriented database based on
* numpy.
*
* subtotal's bincount is typically 3-10 times
Hi,
I'd like to propose some minor modifications to the function
bincount(arr, weights=None), so would like some feedback from other users
of bincount() before I write this up as a proper patch.
Background:
bincount() has two forms:
- bincount(x) returns an integer array of counts, of length max(x)+1
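For reference, the two forms behave like this (minimal sketch):

```python
import numpy as np

x = np.array([1, 3, 3, 0])
w = np.array([0.25, 0.5, 0.75, 1.0])

counts = np.bincount(x)            # occurrences of each value 0..max(x)
wsums = np.bincount(x, weights=w)  # sum of weights falling in each bin
```

Here counts is [1, 1, 0, 2] and wsums is [1.0, 0.25, 0.0, 1.25] (bin 3 collects 0.5 + 0.75).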
Charles R Harris wrote:
>
>
> On 2/3/07, *Stephen Simmons* <[EMAIL PROTECTED]
> <mailto:[EMAIL PROTECTED]>> wrote:
>
> Hi,
>
> Does anyone know why there is an order of magnitude difference
> in the speed of numpy's array.sum() function depe
Hi,
Does anyone know why there is an order of magnitude difference
in the speed of numpy's array.sum() function depending on the axis
of the matrix summed?
To see this, import numpy and create a big array with two rows:
>>> import numpy
>>> a = numpy.ones([2,100], 'f4')
Then using
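The usual explanation is memory layout: for a C-ordered array, axis=1 sums each row's contiguous memory sequentially, while axis=0 strides across rows. A small sketch of the two calls (toy size):

```python
import numpy as np

a = np.ones([2, 100000], 'f4')

s1 = a.sum(axis=1)  # sums along contiguous memory: one total per row
s0 = a.sum(axis=0)  # strided access across rows: one total per column
```

Both give the right answer; the question in the thread is only about their relative speed.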
Cheers, and thanks for any further suggestions,
Stephen
Francesc Altet <[EMAIL PROTECTED]> wrote:
> On Friday 29 December 2006 10:05, Stephen Simmons wrote:
> > Hi,
> >
> > I'm looking for efficient ways to subtotal a 1-d array onto a 2-D grid.
> > This
Hi,
I'm looking for efficient ways to subtotal a 1-d array onto a 2-D grid. This
is more easily explained in code than words, thus:
for n in xrange(len(data)):
    totals[i[n], j[n]] += data[n]
data comes from a series of PyTables files with ~200m rows. Each row has ~20
cols, and I use the fir
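This loop can be expressed with bincount by flattening each (i, j) pair into a single bin number; a sketch on toy-sized arrays (names hypothetical):

```python
import numpy as np

totals = np.zeros((3, 4))
i = np.array([0, 2, 0, 1])
j = np.array([1, 3, 1, 0])
data = np.array([10.0, 20.0, 30.0, 40.0])

# Map each (i, j) pair to its flat index in a C-ordered (3, 4) grid,
# let bincount accumulate the weights, then fold back to 2-D.
flat = np.bincount(i * totals.shape[1] + j, weights=data,
                   minlength=totals.size)
totals += flat.reshape(totals.shape)
```

The two rows hitting cell (0, 1) accumulate to 40.0, matching the original loop.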