data design
The applications I write are made of, let's say, algorithms and data. By data I mean constant data (dicts, tables, etc.): to keep the algorithms simple, I describe whatever is peculiar or data-dependent as data rather than as "case statements". These could be called configuration data.

The lazy way to do this is to have modules that initialize bunches of objects, with attributes holding the data: each object is in effect a row of the "table", and the attribute names are the columns. This is how I have proceeded up to now. Data entered this way are almost "configuration data", with 2 big drawbacks:
- Only a Python programmer can fix the file, so it can't be called a configuration file.
- Even for the author, these data aren't easy to maintain.

I feel pretty much ready to change this:
- make these data true text data, easier to read and fix.
- write the module that will make Python objects out of these data: the extra cost should yield ease of use.

2 questions arise:
- which kind of text data?
  - csv: OK for simple attributes, not easy for lists or complex data.
  - xml: the form won't be easier to read than Python code, but an XML editor could be used, and a formal description of what is expected can be used.
- how can I make the data-to-object transformation both easy and able to spot errors in the text data?

Last, but not least: is there a Python lib implementing at least part of this dream?

-- http://mail.python.org/mailman/listinfo/python-list
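The "row of the table, attribute names being the columns" idea can be sketched with the standard csv module. This is only an illustration, assuming Python 3; the Service type, the column names and the sample rows are hypothetical and do not come from the post.

```python
import csv
import io
from collections import namedtuple

# Hypothetical table: the column names and values are illustrative only.
TABLE_TEXT = """name,port,enabled
web,8080,yes
mail,25,no
"""

# One object per row; the attribute names are the column names.
Service = namedtuple("Service", ["name", "port", "enabled"])


def load_services(text):
    """Turn each CSV row into a Service, reporting bad rows by line number."""
    services = []
    reader = csv.DictReader(io.StringIO(text))
    for row in reader:
        try:
            port = int(row["port"])
        except (TypeError, ValueError):
            raise ValueError("line %d: port must be an integer, got %r"
                             % (reader.line_num, row["port"]))
        if row["enabled"] not in ("yes", "no"):
            raise ValueError("line %d: enabled must be yes or no, got %r"
                             % (reader.line_num, row["enabled"]))
        services.append(Service(row["name"], port, row["enabled"] == "yes"))
    return services


services = load_services(TABLE_TEXT)
```

Converting and checking each field while the line number is still known is one way to report errors in the text data in terms the person editing the file can act on.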
Re: data design
Imbaud Pierre wrote:
> Last, but not least: is there a python lib implementing at least part
> of this dream?

Use the configurations module. It was built to parse configuration files that provide configuration data to a program. It is VERY fast, so the overhead of parsing even thousands of lines of config data is extremely small. I use it a LOT; it is very flexible, and the format of the files is easy for users and programmers to work with.

-Larry Bates
Re: data design
> 2 questions arise:
> - which kind of text data?
> - csv: ok for simple attributes, not easy for lists or complex
> data.
> - xml: the form wont be easier to read than python code,
> but an xml editor could be used, and a formal description
> of what is expected can be used.
>
> Last, but not least: is there a python lib implementing at least part
> of this dream?

There is a csv parser and there are multiple XML parsers in Python (e.g. xml.etree). There is also a ConfigParser module (able to parse .ini-like config files).

I personally like a Python module as the config file the most, e.g. if you need a bunch of key-value pairs or lists of data:
* Python's syntax is pretty nice (dicts, tuples and lists, or just key=value)
* xml is absolutely out of the question
* csv is very limited
* an .ini-like config file for more complex stuff is not bad, but then you can use .py as well.
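As a variation on the "Python syntax as config" suggestion, here is a minimal sketch that reads a data-only Python literal with ast.literal_eval instead of importing a .py module. This is a swapped-in technique, not what the message describes, and the keys and values in CONFIG_TEXT are hypothetical.

```python
import ast

# Hypothetical config text: a single Python literal, no executable code.
CONFIG_TEXT = """
{
    "name": "demo",
    "retries": 3,
    "paths": ["/some/path", "/some/other/path"],
    "size": (640, 480),
}
"""

# literal_eval accepts only literals (dicts, lists, tuples, strings,
# numbers, ...), so function calls in the file are rejected.
config = ast.literal_eval(CONFIG_TEXT.strip())
```

The file keeps the dict, list and tuple syntax, while anything that is not plain data raises an error instead of being executed.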
Re: data design
Szabolcs Nagy wrote:
> there is a csv parser and multiple xml parsers in python (eg
> xml.etree)

I have used both. Both are OK, but they only provide low-level parsing.

> also there is a ConfigParser module (able to parse .ini
> like config files)

I used this years ago; I had forgotten it. Another fine text data format.

> i personally like the python module as config file the most
>
> eg if you need a bunch of key-value pairs or lists of data:
> * python's syntax is pretty nice (dict, tuples and lists or just
> key=value)

But editable only by Python programmers!

> * xml is absolutely out of question
> * csv is very limited
> * .ini like config file for more complex stuff is not bad but then you
> can use .py as well.

Thanks a lot for your advice.
Re: data design
Larry Bates wrote:
> Use the configurations module. It was built to provide a way to parse
> configuration files that provide configuration data to program. It is
> VERY fast so the overhead to parse even thousands of lines of config
> data is extremely small. I use it a LOT and it is very flexible and
> the format of the files is easy for users/programmers to work with.
>
> -Larry Bates

You mean ConfigParser? Otherwise, please be more specific (if you don't mind...).
Re: data design
Imbaud Pierre wrote:
> Larry Bates wrote:
>> Use the configurations module. It was built to provide a way to parse
>> configuration files that provide configuration data to program.
>
> U mean configParser?
> Otherwise be more specific (if U dont mind...)

Sorry, yes, I meant the ConfigParser module. Had a little "brain disconnect" there.

-Larry
Re: data design
On Jan 30, 2:34 pm, Imbaud Pierre wrote:
> 2 questions arise:
> - which kind of text data?
> - csv: ok for simple attributes, not easy for lists or complex
> data.
> - xml: the form wont be easier to read than python code,
> but an xml editor could be used, and a formal description
> of what is expected can be used.

Google for the YAML and JSON formats too.
http://www.yaml.org/
http://www.json.org/

-Paddy
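To give an idea of what the JSON route looks like from Python, here is a small sketch using the standard json module (Python 3). The "services" table and its fields are hypothetical; YAML would need a third-party parser and is not shown.

```python
import json

# Hypothetical data file content: a table expressed as a list of objects.
CONFIG_TEXT = """
{
  "services": [
    {"name": "web", "port": 8080, "tags": ["public", "http"]},
    {"name": "mail", "port": 25, "tags": []}
  ]
}
"""

config = json.loads(CONFIG_TEXT)


def describe_error(text):
    """Return a 'line, column: message' string for invalid JSON, else None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return "line %d, column %d: %s" % (err.lineno, err.colno, err.msg)
    return None
```

Lists and nested structures come back as plain Python lists and dicts, and a syntax error carries its position, which helps whoever has to fix the file.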
Re: data design
Paddy wrote:
> Google for YAML and JSON formats too.
> http://www.yaml.org/
> http://www.json.org/
>
> -Paddy

Hurray for yaml! A perfect fit for my need! And a swell tool!
Thanks a lot!
Re: data design
> Hurray for yaml! A perfect fit for my need! And a swell tool!
> Thanks a lot!

I warn you against yaml. It looks nice, but the underlying format is IMHO too complex (just look at their spec).

You said you don't want Python source because that's too complex for the users. I must say that yaml is not easier to use than Python data structures.

If you want user-friendly config files, then ConfigParser is the way to go.

If you want something really simple and fast, then I'd recommend Lisp s-expressions.

Also, here is an indentation-based, XML-like syntax for tree/hierarchical data structures:
http://www.scottsweeney.com/projects/slip/
Re: data design
Szabolcs Nagy wrote:
> i warn you against yaml
> it looks nice, but the underlying format is imho too complex (just
> look at their spec.)
>
> if you want userfriedly config files then ConfigParser is the way to
> go.

I've been spending the last 2 days weighing ConfigParser and yaml, with much thought and re-organizing of each file type. The underlying difference is that, conceptually, ini files are an absurdly limited subset of yaml, in that ini files are basically limited to a map of maps.

For instance, I have a copy_files section of a configuration. In order to know what goes with what, you have to resort to gymnastics with the option names:

[copy_files]
files_dir1 = this.file that.file
path_dir1 = /some/path

files_dir2 = the_other.file yet_another.file
path_dir2 = /some/other/path

In yaml, it might look thus:

copy_files :
  - files : [this.file, that.file]
    path : /some/path
  - files : [the_other.file, yet_another.file]
    path : /some/other/path

Both are readable (though I prefer the look of equals signs over colons), but yaml doesn't require a lot of string processing to group the files with the paths. I don't even want to think about the coding gymnastics required to split all of the option names and then group those with common suffixes.

Now if the config file were for copying only, ini would be okay, because one could just have sections that group paths and dirs:

[dir1]
files = this.file, that.file
path = /some/path

[dir2]
...

But if you need different kinds of sections, you have outgrown ini.

In essence, ini is limited to a single dictionary of dictionaries, while yaml can express pretty much arbitrary complexity.

James
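For what it is worth, the "gymnastics" for the first ini layout can be sketched as follows, assuming Python 3's configparser (the module is named ConfigParser in the Python 2 of this thread). The function name group_copy_files is hypothetical.

```python
import configparser

INI_TEXT = """
[copy_files]
files_dir1 = this.file that.file
path_dir1 = /some/path

files_dir2 = the_other.file yet_another.file
path_dir2 = /some/other/path
"""


def group_copy_files(text):
    """Group files_<suffix> and path_<suffix> options that share a suffix."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    groups = {}
    for option, value in parser.items("copy_files"):
        # "files_dir1" -> kind "files", suffix "dir1"
        kind, _, suffix = option.partition("_")
        entry = groups.setdefault(suffix, {})
        entry[kind] = value.split() if kind == "files" else value
    return [groups[suffix] for suffix in sorted(groups)]


copy_files = group_copy_files(INI_TEXT)
```

The result has the same shape as the yaml version (a list of mappings with files and path), but that shape lives in the naming convention and in this code rather than in the file format itself.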
Re: data design
On Jan 30, 8:06 pm, "Paddy" wrote:
> Google for YAML and JSON formats too

YAML and JSON are good when used as data-interchange formats, not as configuration files. These formats are too complex for non-programmers, so they will ask for help with every edit ;)

I suggest ini-like files parsed using ConfigParser, but you should also have a look at ConfigObj, which has automatic type conversion and other interesting features.
Re: data design
James Stroud wrote:
> Szabolcs Nagy wrote:
>> i warn you against yaml

I feel both thankful and sorry for your warning. I'm not convinced yet, but I'll be cautious.

>> it looks nice, but the underlying format is imho too complex (just
>> look at their spec.)

Complex indeed, but really powerful. Isn't it true that if I used yaml while sticking to what .ini allows, the yaml files would be simple?

>> you said you don't want python source because that's too complex for
>> the users.
>> i must say that yaml is not easier to use than python data structures.

Easier to read and write, you must agree. Surrounding strings with quotes is a Python requirement, to distinguish them from identifiers, and that alone makes data input for Python somewhat clumsy. Granted, it's a new format to learn. But this format is shared with a much wider community than Python's: isn't that worth the effort? (Well, if yaml succeeds and spreads...)

>> if you want userfriedly config files then ConfigParser is the way to go.

Granted, for END users. I am rather targeting administrators, programmers, integrators: making customization an easy process, and allowing this customization to go much further than changing simple values, isn't this the REAL challenge for new applications?

>> if you want somthing really simple and fast then i'd recommend s-
>> expressions of lisp

Lisp is more powerful than Python. Its syntax deterred many programmers, who adopted Python; it will deter my targeted "customizers". And I have no idea how to translate it to Python structures: that would involve a Python or Lisp translator...

>> also here is an identation based xml-like tree/hierarchical data
>> structure syntax:
>> http://www.scottsweeney.com/projects/slip/

Pretty nice, too! James, have a look at this!

> The underlying
> difference is that, conceptually, ini files are an absurdly limited
> subset of yaml in that ini files are basically limited to a map of a map.

You have a point here.

> In essence, ini is limited to a single dictionary of dictionaries while
> yama can express pretty much arbitrary complexity.

James, this single formula makes things really clear. As we are both working on the subject, maybe we could continue to exchange ideas and information? Have a look at the link Szabolcs Nagy gives: http://www.scottsweeney.com/projects/slip/

I'll dig further into yaml, with 2 questions:
- how do I translate to Python?
- how do I express and/or enforce rules the data should follow? (To avoid the classic: a configuration data error raises some obscure exception.)

Big thanks to Szabolcs Nagy (Hungarian, my friend? I love this country). Although I seem to disagree, your statements are pretty clear and helpful, and... maybe you are right, and I am a fool...

Pierre
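On the second question (enforcing rules on the data), one possible approach is to check the structure after parsing and collect readable messages instead of letting an obscure exception surface later. The sketch below assumes the text has already been parsed into plain dicts and lists, as a YAML or JSON parser would typically return; the function validate_copy_files and its rules are hypothetical, modelled on the copy_files example quoted in this thread.

```python
def validate_copy_files(config):
    """Return a list of human-readable problems; an empty list means valid."""
    if not isinstance(config, dict):
        return ["top level: expected a mapping"]
    entries = config.get("copy_files")
    if not isinstance(entries, list):
        return ["copy_files: expected a list of entries"]
    problems = []
    for index, entry in enumerate(entries):
        where = "copy_files[%d]" % index
        if not isinstance(entry, dict):
            problems.append("%s: expected a mapping with 'files' and 'path'" % where)
            continue
        files = entry.get("files")
        if not isinstance(files, list) or not all(isinstance(f, str) for f in files):
            problems.append("%s.files: expected a list of file names" % where)
        if not isinstance(entry.get("path"), str):
            problems.append("%s.path: expected a string" % where)
    return problems


good = {"copy_files": [{"files": ["this.file", "that.file"], "path": "/some/path"}]}
bad = {"copy_files": [{"files": "this.file", "path": 42}, "oops"]}
```

Calling validate_copy_files(bad) yields one message per broken rule, each naming the offending entry, which is the kind of feedback a person editing the file can use.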
Re: data design
On 1/31/07, James Stroud wrote:
> Both are readable (though I like equals signs in appearance over
> colons), but yaml doesn't require a lot of string processing to group
> the files with the paths. I don't even want to think the coding
> gymnastics required to split all of the option names and then group
> those with common suffixes.

But isn't that a perfect-world example? Consider:

[copy_files]
files_dir1=this.file that.file
path_dir1=/some/path
files_dir2=the_other.file yet_another.file
path_dir2=/some/other/path

versus:

copy_files:
  -files:[this.file,that.file]
   path:/some/path
  -files:[the_other.file,yet_another.file]
   path:/some/other/path

Mandatory indentation is good in programming languages, but does it really belong in configuration files? With tabs verboten to boot.

--
mvh Björn
Re: data design
BJörn Lindqvist wrote:
> Mandatory indentation is good in programming languages, but does it
> really belong in configuration files? With tabs verboten to boot.

I'm not sure whether to agree or disagree with you. My conclusion is that, if it is at all possible, you should try to use an ini file, even if you have to stretch your imagination a bit. More complex formats tempt one to assign some imperative meaning to the structure (as I am doing with my example, which might make it a bad one). However, these more complex formats can be intensely useful for (1) knowledgeable people with (2) complicated data.
Re: data design
James Stroud wrote:
> For instance, I have a copy_files section of a configuration. In order
> to know what goes with what you have to resort to gymnastics with the
> option names
>
> [copy_files]
> files_dir1 = this.file that.file
> path_dir1 = /some/path
>
> files_dir2 = the_other.file yet_another.file
> path_dir2 = /some/other/path
>
> James

You don't have to. With a config file:

###
[copy_files]
/some/path = this.file that.file
C:\a windows\path with spaces= one.1 two.two
	a_continuation_line_starting_with_a_tab.xyz
  and_another_starting_with_a_some_spaces.abc
/some/other/path = the_other.file yet_another.file
###

the following program:

###
#!/usr/bin/python
import ConfigParser

config = ConfigParser.ConfigParser()
config.readfp(open(r'ConfigTest.INI'))
opts = config.options('copy_files')
print opts
print 'Files to be copied:'
for opt in opts:
    path = opt
    optVal = config.get('copy_files', opt)
    #print opt, optVal
    fileNames = optVal.split()
    ### The following lines are only needed for Windows
    ### because the use of ':' in Windows' file name's
    ### device part clashes with its use in ConfigParser
    pathParts = ''
    for ind in range(len(fileNames)):
        if fileNames[ind][-1] in ':=':
            path += ':' + pathParts + fileNames[ind][:-1]
            del fileNames[:ind+1]
            break
        pathParts += fileNames[ind] + ' '
    ### Windows dependent section ends
    print 'Path:', '>' + path + '<'
    for fn in fileNames:
        print '>' + fn + '<'
###

produces the following output:

###
['c', '/some/other/path', '/some/path']
Files to be copied:
Path: >c:\a windows\path with spaces<
>one.1<
>two.two<
>a_continuation_line_starting_with_a_tab.xyz<
>and_another_starting_with_a_some_spaces.abc<
Path: >/some/other/path<
>the_other.file<
>yet_another.file<
Path: >/some/path<
>this.file<
>that.file<
###

Cheers,
Jussi
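For readers on Python 3, the same path-as-option-name idea might be sketched as below. This is not the program from the message: it assumes Python 3's configparser, whose delimiters argument can restrict the key/value separator to "=", so the ':' in a Windows drive letter needs no special handling. The function name read_copy_files is hypothetical, and the sample text is shortened.

```python
import configparser

# Raw string so the backslashes in the Windows path are kept as-is.
INI_TEXT = r"""
[copy_files]
/some/path = this.file that.file
C:\a windows\path with spaces= one.1 two.two
    a_continuation_line.xyz
/some/other/path = the_other.file yet_another.file
"""


def read_copy_files(text):
    """Map each destination path (the option name) to its list of files."""
    # Only '=' separates key and value, so 'C:' stays part of the key.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep the case of path names
    parser.read_string(text)
    return {path: value.split() for path, value in parser.items("copy_files")}


copy_map = read_copy_files(INI_TEXT)
```

Setting optionxform to str also preserves the case of the option names, which the default lower-casing would otherwise change (the 'c' in the output above).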