That is what I am doing, as the SQL dumps were too large. I was going to map the XML to tables and columns to generate the SQL.
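A minimal sketch of the XML-to-SQL mapping described above, not a finished importer. It assumes a simplified, namespace-free sample of the Wikipedia XML export; the real dumps carry an XML namespace and many more fields. The table names (`page`, `revision`), their columns, and the helper names are hypothetical and chosen only for illustration. The input is streamed with `iterparse` because the dumps are large.

```python
import io
import xml.etree.ElementTree as ET

# Assumption: a tiny, namespace-free stand-in for a Wikipedia XML dump.
SAMPLE = b"""<mediawiki>
  <page>
    <title>Apache Drill</title>
    <id>1</id>
    <revision><id>10</id><text>It's a query engine.</text></revision>
    <revision><id>11</id><text>Second revision.</text></revision>
  </page>
</mediawiki>"""


def sql_quote(value):
    """Quote a string as a SQL literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def page_to_sql(page):
    """Yield INSERT statements for one <page> element and its revisions.

    The nested <revision> elements become rows in a child table that
    points back to the page through a page_id column (hypothetical schema).
    """
    page_id = int(page.findtext("id"))
    title = page.findtext("title") or ""
    yield f"INSERT INTO page (id, title) VALUES ({page_id}, {sql_quote(title)});"
    for rev in page.findall("revision"):
        rev_id = int(rev.findtext("id"))
        text = rev.findtext("text") or ""
        yield (
            "INSERT INTO revision (id, page_id, text) "
            f"VALUES ({rev_id}, {page_id}, {sql_quote(text)});"
        )


def dump_to_sql(stream):
    """Stream the dump and emit SQL page by page instead of loading it whole."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "page":
            yield from page_to_sql(elem)
            elem.clear()  # drop the processed page's children to save memory


statements = list(dump_to_sql(io.BytesIO(SAMPLE)))
for statement in statements:
    print(statement)
```

Flattening each nested element into its own table with a foreign key is the step that loses the nesting, which is why the XML form may be the more interesting input for the nested-data use case.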

Stefan

Sent from my iPad

On 19.01.2013 at 23:21, "Jacques Nadeau" <[email protected]> wrote:

> Stefan, one other thought. It might also be interesting to explore working
> with the XML representation of the Wikipedia data to push the nested data
> requirements.
>
> Jacques
>
> On Sat, Jan 19, 2013 at 10:51 AM, Jacques Nadeau
> <[email protected]> wrote:
>
>>> * I drew a UML diagram. I saw that there is some Gliffy support in
>>> Confluence, but the free account is pretty much useless. I used
>>> OmniGraffle to draw the diagram, but this is payware on the Mac - is there
>>> some usable freeware alternative? Don't mention tigris :-)
>>
>> I don't have any suggestions on this.
>>
>>> * I have some ideas on the queries, but I am not sure how I should
>>> specify them. Should I use pseudo SQL? Prose? I saw the syntax document on
>>> the server; is it mature enough that I should attempt to use its syntax?
>>> Is there a BNF or, better, an ANTLR grammar I can use to check my syntax?
>>> Should I draw one up while I am at it?
>>
>> I suggest you target SQL2003 (including subqueries). We're looking at how
>> to use Optiq's SQL parser for Drill. Our goal is to stay as close as
>> possible to that spec but add the following extensions:
>> - Add flatten operator similar to BigQuery syntax.
>> - Support use of selection and output identifiers using dotted/bracketed
>>   notation. E.g. "select person.children[0].age as
>>   output.profile.firstChildAge"
>> - Support new functions that can accept nested values including
>>   collections and maps. For example "select ARRAY_LENGTH(person.children)".
>>
>> Once you have some SQL examples, the next goal would be to manually
>> translate those into Logical Plan syntax. This syntax is still maturing so
>> I'd take it to the SQL stage first.
>>
>>> Stefan
>>>
>>> On 19.01.2013, at 02:05, Jacques Nadeau <[email protected]> wrote:
>>>
>>>> The wiki is up. Michael and Stefan, it would be great if you started
>>>> putting your use case thoughts there.
>>>>
>>>> Jacques
>>>>
>>>> On Sun, Jan 13, 2013 at 3:31 PM, Ted Dunning <[email protected]> wrote:
>>>>
>>>>> Ahh... yes. That wiki. I will ping infra again.
>>>>>
>>>>> (I was attaching your comment to the wikipedia use case and had confused
>>>>> myself)
>>>>>
>>>>> On Sun, Jan 13, 2013 at 2:53 PM, Michael Hausenblas <
>>>>> [email protected]> wrote:
>>>>>
>>>>>>> What do you need from me?
>>>>>>
>>>>>> Maybe I've overlooked something, in which case I apologize - was wondering
>>>>>> if the public Wiki for Drill is available where Stefan, I and others can
>>>>>> write up the UC and queries.
>>>>>>
>>>>>> Cheers,
>>>>>> Michael
>>>>>>
>>>>>> --
>>>>>> Michael Hausenblas
>>>>>> Ireland, Europe
>>>>>> http://mhausenblas.info/
>>>>>>
>>>>>> On 13 Jan 2013, at 14:20, Ted Dunning <[email protected]> wrote:
>>>>>>
>>>>>>> What do you need from me?
>>>>>>>
>>>>>>> On Sun, Jan 13, 2013 at 11:06 AM, Michael Hausenblas <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> as soon as we hear back from Ted re the Wiki we work there.
