Naresh,

As far as Windows is concerned, maybe you can install Cygwin and try running the Blur shell. I haven't tried it, but it should work.

Regards,
Gagan

On Fri, Nov 22, 2013 at 12:25 AM, Garrett Barton <[email protected]> wrote:

> Naresh,
>
> I understand your problem set better now.
>
> As far as a data structure I would define the following fields within a
> family called data:
>
> data.measure (Type String, not fieldLessIndexed)
> data.period (Type String, not fieldLessIndexed)
> data.pool1..n (Type String, not fieldLessIndexed)
> data.tags (Type String)
> data.cost (Type Long, not fieldLessIndexed)
>
> If you were in the Blur shell you would do something like:
> create -t myTable -c 4
> definecolumn myTable data measure String
> definecolumn myTable data period String
> definecolumn myTable data tags String
> definecolumn myTable data cost Long
>
> This will let you create as many pool columns as you want, and when you
> retrieve the row you will get the titles back by virtue of the column
> names. When you query against a tag, you would query against the tags
> field, into which you have also loaded your pool data.
> So rewriting your queries into working Blur queries (assuming your family
> is called 'data'; I don't know exactly what you're working on, so I'm sure
> you could come up with a better name) would look like:
>
> Query 1: I need to get all rows with
> data.measure:Cost AND data.period:Nov13 AND data.tags:Tag1
> O/P = Row1, Row3
> Query 2: get all rows with
> data.measure:Cost AND data.period:Dec13 AND data.tags:Tag1 AND
> data.tags:TagA
> O/P = Row4, Row5
>
> As far as getting it to work in Windows, I wouldn't wait for that to
> happen too soon. If you download any favorite Linux distro live install,
> install VirtualBox, and download the latest release of Blur you could be
> running in under an hour (depending on bandwidth).
>
> I will reply to the JIRA ticket about Hadoop 2.x with my mods soon,
> hopefully with a patch to make things work.
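
A minimal sketch of the tags idea in plain Python, in case it helps to see it run. It assumes an in-memory stand-in for the table and does not use the Blur API; ROWS, to_record and find_rows are hypothetical names chosen for illustration. Each pool column is kept as-is, and every pool value is also copied into a single tags field, so a query never needs to know which pool column a tag came from.

```python
# Hypothetical in-memory model of the suggested layout; this is NOT the Blur API.
# The six rows are the ones Naresh listed in his "Doubt 3" example.
ROWS = {
    "Row1": {"Measure": "Cost", "Period": "Nov13", "Pool1": "Tag1", "Pool2": "TagA", "Cost": 50},
    "Row2": {"Measure": "Cost", "Period": "Nov13", "Pool1": "Tag2", "Pool2": "TagB", "Cost": 20},
    "Row3": {"Measure": "Cost", "Period": "Nov13", "Pool1": "Tag1", "Cost": 20},
    "Row4": {"Measure": "Cost", "Period": "Dec13", "Pool1": "Tag1", "Pool2": "TagA",
             "Pool3": "TagP", "Cost": 150},
    "Row5": {"Measure": "Cost", "Period": "Dec13", "Pool1": "Tag1", "Pool2": "TagA",
             "Pool4": "TagQ", "Cost": 170},
    "Row6": {"Measure": "Cost", "Period": "Dec13", "Pool5": "Tag1", "Cost": 120},
}


def to_record(row):
    """Keep each pool column and also copy every pool value into 'tags'."""
    pools = {k.lower(): v for k, v in row.items() if k.lower().startswith("pool")}
    record = {"measure": row["Measure"], "period": row["Period"], "cost": row["Cost"]}
    record.update(pools)
    record["tags"] = sorted(pools.values())
    return record


def find_rows(records, measure, period, tags):
    """Return ids of rows matching measure AND period AND every requested tag."""
    return [
        row_id
        for row_id, rec in records.items()
        if rec["measure"] == measure
        and rec["period"] == period
        and all(tag in rec["tags"] for tag in tags)
    ]


RECORDS = {row_id: to_record(row) for row_id, row in ROWS.items()}

print(find_rows(RECORDS, "Cost", "Nov13", ["Tag1"]))          # ['Row1', 'Row3']
print(find_rows(RECORDS, "Cost", "Dec13", ["Tag1", "TagA"]))  # ['Row4', 'Row5']
```

Row6 drops out of the second query because its only tag is Tag1, which matches the O/P Naresh expected.
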
>
> Take it easy,
> ~Garrett
>
> On Thu, Nov 21, 2013 at 12:48 PM, Naresh Yadav <[email protected]> wrote:
>
>> Hi,
>>
>> Thanks much Garrett for guiding me, that was really helpful.
>>
>> For Doubt *1* I will definitely need your help once I start trying the
>> installation. Please share a document on this if possible.
>>
>> For Doubt *2* I think I will be able to manage with a VM and will explore
>> that. It would have been better for me if somebody had already installed
>> it on Windows by making bat files, so that I could reuse them.
>>
>> For Doubt *3* my actual case is like this (assume these are rows in an
>> Excel sheet; that is how my data will be):
>>
>> Row1 : Measure=Cost, Period=Nov13, Pool1=Tag1, Pool2=TagA, Cost=50
>> Row2 : Measure=Cost, Period=Nov13, Pool1=Tag2, Pool2=TagB, Cost=20
>> Row3 : Measure=Cost, Period=Nov13, Pool1=Tag1, Cost=20
>> Row4 : Measure=Cost, Period=Dec13, Pool1=Tag1, Pool2=TagA, Pool3=TagP, Cost=150
>> Row5 : Measure=Cost, Period=Dec13, Pool1=Tag1, Pool2=TagA, Pool4=TagQ, Cost=170
>> Row6 : Measure=Cost, Period=Dec13, Pool5=Tag1, Cost=120
>>
>> Query 1 : I need to get all rows with
>> Measure:Cost, Period:Nov13, Tag1
>> O/P = Row1, Row3
>> Query 2 : get all rows with
>> Measure:Cost, Period:Dec13, Tag1, TagA
>> O/P = Row4, Row5
>>
>> So the challenge for me is the Tag parts, as they vary between rows, and
>> while querying on them I will not know their column/pool names. I can
>> just have N tags in any row.
>>
>> Will such querying be supported, or can you suggest a better data model
>> for storing this case?
>>
>> Naresh
>>
>> On Thu, Nov 21, 2013 at 8:42 PM, Garrett Barton <[email protected]> wrote:
>>
>> > Welcome aboard!
>> >
>> > I can answer a few:
>> >
>> > 1. Yes with some build flags and script tweaking I can help with. I am
>> > running it now.
>> >
>> > 2. You will have to make startup scripts for Windows, and honestly I could
>> > not tell you if Blur would even run in a Windows environment.
>> > Have you considered doing dev in a VM? Or running a VM on your Windows
>> > machine at least for hosting the Hadoop stack?
>> >
>> > 3. Are you familiar with Lucene itself? You must query against a column
>> > (ok, not 100% true with Blur, but it seems like you have specified
>> > field1=x field2=y requirements). I am slightly confused by your queries,
>> > as they have a mix of column names and values that are in different
>> > columns in your example.
>> > Assuming your first query is cost:50 AND period:Nov13 AND pool1:Tag1 then
>> > sure. If you meant any kind of cost, then you simply omit that from the
>> > query in the first place.
>> > Assuming your second query is (cost:50 OR cost:150) AND period:Dec13 AND
>> > pool1:Tag1 AND pool2:Tag2 then sure that works too.
>> >
>> > For the most part, if you can write a pretty standard SQL statement to
>> > query for your data as if it was in a database, that can be duplicated
>> > inside Blur.
>> >
>> > Millions of rows will be fine. A single table with the column names you
>> > have described is fine; you will have to come up with some kind of unique
>> > identifier for each row to load into Blur. (Like a primary key in a
>> > database)
>> >
>> > Let me know if you have any more questions. :)
>> >
>> > ~Garrett
>> >
>> > On Thu, Nov 21, 2013 at 5:38 AM, Naresh Yadav <[email protected]> wrote:
>> >
>> > > Hi,
>> > >
>> > > I have been reading about Apache Blur for the last day, and I found it
>> > > a perfect fit for my project. But I have some doubts:
>> > >
>> > > 1. Will I be able to use an existing Hadoop 2.0 cluster with the latest
>> > > version of Apache Blur?
>> > >
>> > > 2. My development environment is Windows, and Hadoop 2.0 supports
>> > > Windows, so my doubt is whether the latest version of Apache Blur will
>> > > work smoothly on Windows. Will I get startup scripts for Windows?
>> > >
>> > > 3. Here are 4 rows of my data which I need to store in one table:
>> > > Cost=50, Period=Nov13, Pool1=Tag1, Pool2=Tag2
>> > > Cost=50, Period=Nov13, Pool1=Tag1, Pool2=Tag3
>> > > Cost=150, Period=Dec13, Pool1=Tag1, Pool2=Tag2, Pool3=Tag3
>> > > Cost=150, Period=Dec13, Pool1=Tag1, Pool2=Tag2, Pool3=Tag4
>> > >
>> > > Query 1 : I need to get all rows with
>> > > Cost, Nov13, Tag1
>> > > Query 2 : get all rows with
>> > > Cost, Dec13, Tag1, Tag2
>> > >
>> > > Will I be able to perform such queries? If yes, how should I design
>> > > the Blur table for this use case? Note: this table can hold millions
>> > > of rows with all the historic data.
>> > >
>> > > Please help me, I am new to big data technologies. Your guidance will
>> > > give me direction to proceed.
>> > >
>> > > Thanks
>> > > Naresh
