Dear all,

For complex files, I would suggest SQLite (sqlite.org). It is open, scales well, and is very flexible because the data can be queried with SQL. I use it for all my more complex datasets, interlinked tables, etc.
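To make the linked-tables idea a bit more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and rows are made up purely for illustration and are not taken from any real dataset; the point is only that each place gets one unique id and the results table refers to that id instead of repeating the place name.

```python
import sqlite3


def build_example_db(path=":memory:"):
    """Create two linked tables with placeholder rows (all values are hypothetical)."""
    conn = sqlite3.connect(path)
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE places (
            place_id INTEGER PRIMARY KEY,   -- the unique id every other table refers to
            name     TEXT NOT NULL
        );
        CREATE TABLE results (
            place_id  INTEGER NOT NULL REFERENCES places(place_id),
            candidate TEXT NOT NULL,
            votes     INTEGER NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO places VALUES (?, ?)",
        [(1, "Place A"), (2, "Place B")],
    )
    conn.executemany(
        "INSERT INTO results VALUES (?, ?, ?)",
        [(1, "Candidate X", 120), (1, "Candidate Y", 80), (2, "Candidate X", 50)],
    )
    conn.commit()
    return conn


def votes_per_place(conn):
    """Join the two tables on the shared id and total the votes for each place."""
    return conn.execute(
        """
        SELECT p.name, SUM(r.votes)
        FROM places AS p
        JOIN results AS r ON r.place_id = p.place_id
        GROUP BY p.place_id, p.name
        ORDER BY p.place_id
        """
    ).fetchall()


if __name__ == "__main__":
    conn = build_example_db()
    print(votes_per_place(conn))
```

Passing a file name instead of ":memory:" writes everything into a single database file, which can then be shared like any other file.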
My five cents,
Raphael

On 26.05.2014 06:50, Dilip Damle wrote:
> Hello,
>
> I think we need to discuss the following
>
> 1. When is the data eligible to go to Repository
>
> There could be several factors here. Mainly cleanliness and completeness.
>
> 2. Place other than Repository for temporary data.
> I think it should surely not be "only an attachment to a post here"
> Then it becomes difficult to find later
> Administrators should decide on suitable place
>
> 3. The particular formats itself
>
> This could vary based on type of data
>
> My observations is that for many types of data Multiple Linked Tables
> serve better than a single CSV file which is more common.
> In this case is .mdb acceptable or is there any other open format for
> linked tables.
>
> this could be a long topic...
>
> 4. Compressing multiple files in one file
>
> Unless there is a reason multiple files that go together should be
> bundled in to one file.
> This should also be true for repository.
>
> 5. About the content itself
>
> Since multiple people will contribute/edit to data we will have to have
> some rules.
> example : when there is a Unique for the data it should always be used
> otherwise combining comparing the data becomes difficult.
> ( presently I am trying to collate the election results data and find
> there are differences in the different sources especially in the Names
> of places. Will be putting up the collated data in .mdb format in a few
> days)
>
> On Friday, May 23, 2014 10:06:35 AM UTC+5:30, Nisha Thompson wrote:
>
> In the discussion guidelines thread Dilip suggested we have some
> data sharing guidelines and a place to store some of the more casual
> datasets, people are cleaning up.
>
> I think its a good idea.
>
> Can we use this thread as a place to discuss formats, procedure, and
> a good place to put it.
>
> We have a github already set up, we can start with that, maybe
> create a project called - Data that needs to be cleaned up.
>
> Any other suggestions?
>
> Nisha
>
> --
> Nisha Thompson
> DataMeet.org
> ni...@datameet.org
> skype: nishaqt
> mobile: 962-061-2245
>
> --
> Datameet is a community of Data Science enthusiasts in India. Know more
> about us by visiting http://datameet.org
> ---
> You received this message because you are subscribed to the Google
> Groups "datameet" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to datameet+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

--
Raphael Susewind | BGHS Bielefeld University, CSASP University of Oxford
Snail Mail | Melanchthonstr. 4a, 33615 Bielefeld, Germany
Web & Twitter | http://www.raphael-susewind.de | @RaphaelSusewind
Please do consider http://www.gnupg.org for encryption (key id A5ED49AE)