Dear all,

For complex files, I would suggest SQLite (sqlite.org). It is open,
scalable, and very expressive thanks to SQL queries. I use it for all
my more complex datasets, interlinked tables, etc.
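
As a rough sketch of what interlinked tables in SQLite can look like,
here is a small example using Python's built-in sqlite3 module. The
table names, column names, and values are placeholders made up for
illustration; they are not taken from any real dataset.

```python
import sqlite3


def build_demo(path=":memory:"):
    """Create two linked tables and fill them with placeholder rows."""
    conn = sqlite3.connect(path)
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE places (
            place_id INTEGER PRIMARY KEY,   -- hypothetical unique ID
            name     TEXT NOT NULL
        );
        CREATE TABLE results (
            place_id  INTEGER NOT NULL REFERENCES places(place_id),
            candidate TEXT NOT NULL,
            votes     INTEGER NOT NULL
        );
    """)
    # Placeholder values only, not real election data.
    conn.executemany(
        "INSERT INTO places VALUES (?, ?)",
        [(1, "Place A"), (2, "Place B")],
    )
    conn.executemany(
        "INSERT INTO results VALUES (?, ?, ?)",
        [(1, "Candidate X", 120), (1, "Candidate Y", 80), (2, "Candidate X", 50)],
    )
    conn.commit()
    return conn


def votes_per_place(conn):
    """Join the two tables on the shared ID and total the votes per place."""
    return conn.execute("""
        SELECT p.name, SUM(r.votes)
        FROM places AS p
        JOIN results AS r ON r.place_id = p.place_id
        GROUP BY p.place_id, p.name
        ORDER BY p.place_id
    """).fetchall()


if __name__ == "__main__":
    print(votes_per_place(build_demo()))
```

Passing a file name instead of ":memory:" keeps all tables together in
one ordinary file that can be shared as a single unit.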

My five cents,
Raphael

On 26.05.2014 06:50, Dilip Damle wrote:
> Hello,
> 
> I think we need to discuss the following
> 
> 1. When is the data eligible to go to the Repository?
> 
> There could be several factors here, mainly cleanliness and completeness.
> 
> 2. A place other than the Repository for temporary data
> I think it should surely not be "only an attachment to a post here",
> because then it becomes difficult to find later.
> Administrators should decide on a suitable place.
> 
> 3. The particular formats themselves
> 
> This could vary based on type of data
> 
> My observation is that for many types of data, multiple linked tables
> serve better than a single CSV file, which is more common.
> In this case, is .mdb acceptable, or is there another open format for
> linked tables?
> 
> This could be a long topic...
> 
> 4. Compressing multiple files in one file
> 
> Unless there is a reason not to, multiple files that go together should
> be bundled into one file.
> This should also be true for the Repository.
> 
> 5. About the content itself
> 
> Since multiple people will contribute to and edit the data, we will need
> some rules.
> Example: when the data has a unique identifier, it should always be used;
> otherwise combining and comparing the data becomes difficult.
> (At present I am trying to collate the election results data and find
> that the sources differ, especially in the names of places. I will put
> up the collated data in .mdb format in a few days.)
> 
> On Friday, May 23, 2014 10:06:35 AM UTC+5:30, Nisha Thompson wrote:
> 
>     In the discussion guidelines thread, Dilip suggested we have some
>     data sharing guidelines and a place to store some of the more casual
>     datasets people are cleaning up.
> 
>     I think it's a good idea.
> 
>     Can we use this thread as a place to discuss formats, procedure, and
>     a good place to put the data?
> 
>     We already have a GitHub account set up, so we can start with that,
>     maybe by creating a project called "Data that needs to be cleaned up".
> 
>     Any other suggestions?
> 
>     Nisha
> 
>     -- 
>     Nisha Thompson
>     DataMeet.org
>     ni...@datameet.org
>     skype: nishaqt
>     mobile: 962-061-2245
> 
> -- 
> Datameet is a community of Data Science enthusiasts in India. Know more
> about us by visiting http://datameet.org
> ---
> You received this message because you are subscribed to the Google
> Groups "datameet" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to datameet+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

-- 
Raphael Susewind | BGHS Bielefeld University, CSASP University of Oxford
      Snail Mail | Melanchthonstr. 4a, 33615 Bielefeld, Germany
   Web & Twitter | http://www.raphael-susewind.de | @RaphaelSusewind

Please do consider http://www.gnupg.org for encryption (key id A5ED49AE)
