Re: dataframe implementations

2015-12-03 Thread Jay Norwood via Digitalmars-d-learn
On Saturday, 21 November 2015 at 14:16:26 UTC, Laeeth Isharc wrote: Not sure it is a great idea to use a variant as the basic option when very often you will know that every cell in a particular column will be of the same type. I'm reading today about an n-dim extension to pandas named

Re: dataframe implementations

2015-11-21 Thread Laeeth Isharc via Digitalmars-d-learn
On Thursday, 19 November 2015 at 22:14:01 UTC, ZombineDev wrote: On Thursday, 19 November 2015 at 06:33:06 UTC, Jay Norwood wrote: On Wednesday, 18 November 2015 at 22:46:01 UTC, jmh530 wrote: My sense is that any data frame implementation should try to build on the work that's being done with

Re: dataframe implementations

2015-11-20 Thread jmh530 via Digitalmars-d-learn
On Thursday, 19 November 2015 at 06:33:06 UTC, Jay Norwood wrote: Maybe the nd slices could be applied if you considered each row to be the same structure, and slice by rows rather than operating on columns. Pandas supports a multi-dimension panel. Maybe this would be the application for

Re: dataframe implementations

2015-11-19 Thread John Colvin via Digitalmars-d-learn
On Thursday, 19 November 2015 at 06:33:06 UTC, Jay Norwood wrote: On Wednesday, 18 November 2015 at 22:46:01 UTC, jmh530 wrote: My sense is that any data frame implementation should try to build on the work that's being done with n-dimensional slices. I've been watching that development, but

Re: dataframe implementations

2015-11-19 Thread ZombineDev via Digitalmars-d-learn
On Thursday, 19 November 2015 at 06:33:06 UTC, Jay Norwood wrote: On Wednesday, 18 November 2015 at 22:46:01 UTC, jmh530 wrote: My sense is that any data frame implementation should try to build on the work that's being done with n-dimensional slices. I've been watching that development, but

Re: dataframe implementations

2015-11-18 Thread Laeeth Isharc via Digitalmars-d-learn
On Monday, 2 November 2015 at 13:54:09 UTC, Jay Norwood wrote: I was reading about the Julia dataframe implementation yesterday, trying to understand their decisions and how D might implement. From my notes, 1. they are currently using a dictionary of column vectors. 2. for NA (not available)

Re: dataframe implementations

2015-11-18 Thread Laeeth Isharc via Digitalmars-d-learn
On Tuesday, 17 November 2015 at 13:56:14 UTC, Jay Norwood wrote: I looked through the dataframe code and a couple of comments... I had thought perhaps an app could read in the header info and type info from hdf5, and generate D struct definitions with column headers as symbol names. That

Re: dataframe implementations

2015-11-18 Thread Jay Norwood via Digitalmars-d-learn
On Wednesday, 18 November 2015 at 17:15:38 UTC, Laeeth Isharc wrote: What do you think about the use of NaN for missing floats? In theory I could imagine wanting to distinguish between an NaN in the source file and a missing value, but in my world I never felt the need for this. For integers
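
A minimal sketch of the NaN-as-missing idea raised above, for float columns only. The function name and sample data are hypothetical and are not taken from any of the libraries discussed in the thread. It also shows the trade-off mentioned: once NaN stands for "missing", a NaN that was genuinely present in the source file can no longer be told apart from a missing cell.

```python
import math

def mean_skipping_missing(values):
    """Average a float column, treating every NaN as a missing cell."""
    present = [v for v in values if not math.isnan(v)]
    # With no usable cells the result is itself reported as missing (NaN).
    return sum(present) / len(present) if present else math.nan

# Hypothetical column: the middle cell is missing.
column = [1.0, math.nan, 3.0]
column_mean = mean_skipping_missing(column)
```

The approach needs no extra storage, but it only works for types that have a NaN value in the first place.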

Re: dataframe implementations

2015-11-18 Thread Jay Norwood via Digitalmars-d-learn
On Wednesday, 18 November 2015 at 18:04:30 UTC, Jay Norwood wrote: vector. I'll try to find the discussions and post the link. Here are the two discussions I recall on the Julia NA implementation. http://wizardmac.tumblr.com/post/104019606584/whats-wrong-with-statistics-in-julia-a-reply

Re: dataframe implementations

2015-11-18 Thread Jay Norwood via Digitalmars-d-learn
On Wednesday, 18 November 2015 at 22:46:01 UTC, jmh530 wrote: My sense is that any data frame implementation should try to build on the work that's being done with n-dimensional slices. I've been watching that development, but I don't have a feel for where it could be applied in this case,

Re: dataframe implementations

2015-11-18 Thread jmh530 via Digitalmars-d-learn
On Monday, 2 November 2015 at 13:54:09 UTC, Jay Norwood wrote: I saw someone posting that they were working on a DataFrame implementation here, but haven't been able to locate any code on GitHub, and was wondering what implementation decisions are being made here. Thanks. My sense is that

Re: dataframe implementations

2015-11-18 Thread Jay Norwood via Digitalmars-d-learn
One more discussion link on the NA subject. This one is on the R implementation of NA using a single encoding of NaN, as well as its treatment of a selected integer value as an NA. http://rsnippets.blogspot.com/2013/12/gnu-r-vs-julia-is-it-only-matter-of.html
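
A rough sketch of the two encodings described above: one reserved NaN bit pattern standing for NA in float columns, and one reserved integer standing for NA in integer columns. The payload 1954 and the smallest 32-bit integer are the values I understand R to use; the exact bit layout here (a quiet NaN with the payload in the low 32 bits) is an assumption, and every name is hypothetical.

```python
import math
import struct

NA_PAYLOAD = 1954                                # assumed payload marking NA
NA_BITS = 0x7FF8000000000000 | NA_PAYLOAD        # quiet NaN carrying the payload
NA_REAL = struct.unpack("<d", struct.pack("<Q", NA_BITS))[0]
NA_INT = -2**31                                  # assumed integer sentinel

def float_bits(x):
    """Return the 64-bit pattern of a float as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def is_na_real(x):
    """True only for the reserved NaN pattern, not for ordinary NaNs."""
    return math.isnan(x) and (float_bits(x) & 0xFFFFFFFF) == NA_PAYLOAD

def is_na_int(x):
    """True when an integer cell holds the reserved sentinel value."""
    return x == NA_INT
```

Unlike treating every NaN as missing, this keeps an ordinary NaN distinguishable from NA, at the cost of giving up one integer value and relying on the NaN payload surviving whatever operations are applied to the column.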

Re: dataframe implementations

2015-11-17 Thread Jay Norwood via Digitalmars-d-learn
I looked through the dataframe code and a couple of comments... I had thought perhaps an app could read in the header info and type info from hdf5, and generate D struct definitions with column headers as symbol names. That would enable faster processing than with the associative arrays, as
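
A small sketch of the code-generation idea above: given column names and D type names, emit the text of a D struct definition with one field per column. It assumes the header and type information has already been read from the HDF5 file by some other means. The function name, the sample columns, and the struct name `Row` are all hypothetical.

```python
def d_struct_source(struct_name, columns):
    """Return D source text for a struct with one field per (name, d_type) pair."""
    lines = [f"struct {struct_name}", "{"]
    for name, d_type in columns:
        # Column headers are used as field names as-is; real headers would
        # need sanitizing into valid D identifiers first.
        lines.append(f"    {d_type} {name};")
    lines.append("}")
    return "\n".join(lines)

# Hypothetical header info for a three-column table.
row_source = d_struct_source(
    "Row",
    [("date", "string"), ("price", "double"), ("volume", "long")],
)
```

The generated text would then be compiled into the application, so that column access becomes a plain field access rather than a lookup by string key.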

dataframe implementations

2015-11-02 Thread Jay Norwood via Digitalmars-d-learn
I was reading about the Julia dataframe implementation yesterday, trying to understand their decisions and how D might implement. From my notes, 1. they are currently using a dictionary of column vectors. 2. for NA (not available) they are currently using an array of bytes, effectively as a
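
A minimal sketch of points 1 and 2 above: a dictionary of column vectors, where each column carries a parallel array of bytes recording which cells are NA. The message is cut off before it says exactly how the byte array is used, so treating it as one flag per cell is an assumption. This is not Julia's code, and the class and column names are hypothetical.

```python
from array import array

class Column:
    """A typed value vector plus a parallel byte array of NA flags."""

    def __init__(self, typecode, values):
        # None marks a missing cell; its slot in the data vector holds a
        # placeholder 0 that is never read back.
        self.data = array(typecode, [0 if v is None else v for v in values])
        self.na = bytearray(1 if v is None else 0 for v in values)

    def get(self, i):
        """Return the cell value, or None when the cell is flagged NA."""
        return None if self.na[i] else self.data[i]

    def mean(self):
        """Average the cells that are not flagged NA."""
        present = [x for x, flag in zip(self.data, self.na) if not flag]
        return sum(present) / len(present) if present else None

# The frame itself is just a dictionary from column name to column vector.
frame = {
    "price": Column("d", [1.5, None, 2.5]),
    "volume": Column("q", [10, 20, None]),
}
```

Because the flags live outside the data vector, the same scheme works for integer columns, which have no NaN to borrow.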

Re: dataframe implementations

2015-11-02 Thread Laeeth Isharc via Digitalmars-d-learn
On Monday, 2 November 2015 at 13:54:09 UTC, Jay Norwood wrote: I was reading about the Julia dataframe implementation yesterday, trying to understand their decisions and how D might implement. From my notes, 1. they are currently using a dictionary of column vectors. 2. for NA (not available)

Re: dataframe implementations

2015-11-02 Thread Jay Norwood via Digitalmars-d-learn
On Monday, 2 November 2015 at 15:33:34 UTC, Laeeth Isharc wrote: Hi Jay. That may have been me. I have implemented something very basic, but you can read and write my proto dataframe to/from CSV and HDF5. The code is up here: https://github.com/Laeeth/d_dataframes Yes, thanks. I