Welcome aboard!

I can answer a few:

1. Yes, with some build flags and script tweaks, which I can help with. I am
running that setup now.

2. You will have to write your own startup scripts for Windows, and honestly I
could not tell you whether Blur would even run in a Windows environment. Have
you considered doing dev in a VM? Or running a VM on your Windows machine, at
least for hosting the Hadoop stack?

3. Are you familiar with Lucene itself? You must query against a column
(ok, not 100% true with Blur, but it seems like you have specified field1=x
field2=y requirements). I am slightly confused by your queries, as they mix
column names with values that are in different columns in your example.
Assuming your first query is cost:50 AND period:Nov13 AND pool1:Tag1 then
sure. If you meant any kind of cost, then you simply omit that from the
query in the first place.
Assuming your second query is (cost:50 OR cost:150) AND period:Dec13 AND
pool1:Tag1 AND pool2:Tag2 then sure that works too.

For the most part, if you can write a pretty standard SQL statement to
query for your data as if it were in a database, the same query can be
expressed inside Blur.
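
To make the two queries from point 3 concrete, here is a rough sketch in plain
Python. It is not the Blur or Lucene API: build_query, matches, search and the
rowid values are names I made up for illustration. It only shows how the
column:value clauses combine with AND/OR and which of your four sample rows
each query would select.

```python
# Plain-Python illustration only; nothing here calls Blur or Lucene.
# The "rowid" values are hypothetical identifiers added for the example.
ROWS = [
    {"rowid": "r1", "cost": "50", "period": "Nov13", "pool1": "Tag1", "pool2": "Tag2"},
    {"rowid": "r2", "cost": "50", "period": "Nov13", "pool1": "Tag1", "pool2": "Tag3"},
    {"rowid": "r3", "cost": "150", "period": "Dec13", "pool1": "Tag1", "pool2": "Tag2", "pool3": "Tag3"},
    {"rowid": "r4", "cost": "150", "period": "Dec13", "pool1": "Tag1", "pool2": "Tag2", "pool3": "Tag4"},
]


def build_query(constraints):
    """Render {column: [accepted values]} as a Lucene-style query string."""
    clauses = []
    for column, values in constraints.items():
        terms = [f"{column}:{value}" for value in values]
        # Several accepted values for one column become an OR group.
        clauses.append(terms[0] if len(terms) == 1 else "(" + " OR ".join(terms) + ")")
    return " AND ".join(clauses)


def matches(row, constraints):
    """True if the row satisfies every column constraint (AND across columns)."""
    return all(row.get(column) in values for column, values in constraints.items())


def search(rows, constraints):
    """Return the rowids of all rows that satisfy the constraints."""
    return [row["rowid"] for row in rows if matches(row, constraints)]


query1 = {"cost": ["50"], "period": ["Nov13"], "pool1": ["Tag1"]}
query2 = {"cost": ["50", "150"], "period": ["Dec13"], "pool1": ["Tag1"], "pool2": ["Tag2"]}

print(build_query(query1), "->", search(ROWS, query1))
print(build_query(query2), "->", search(ROWS, query2))
```

Dropping the "cost" entry from either dictionary corresponds to "any kind of
cost": the clause simply disappears from the query string.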


Millions of rows will be fine. A single table with the column names you
have described is fine; you will have to come up with some kind of unique
identifier for each row to load into Blur (like a primary key in a
database).
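
One possible way to get such an identifier, sketched in plain Python: give each
record a random UUID before loading it. The with_row_ids helper and the "rowid"
key are hypothetical names for this example, not anything Blur requires; any
scheme that yields a unique value per row would do.

```python
import uuid


def with_row_ids(records):
    """Return copies of the records, each with a unique 'rowid' added.

    'rowid' is an illustrative key name, not a Blur requirement.
    """
    return [dict(record, rowid=str(uuid.uuid4())) for record in records]


records = [
    {"cost": "50", "period": "Nov13", "pool1": "Tag1", "pool2": "Tag2"},
    {"cost": "50", "period": "Nov13", "pool1": "Tag1", "pool2": "Tag3"},
]

for record in with_row_ids(records):
    print(record["rowid"], record["cost"], record["period"])
```

If your source data already has a natural key (an invoice number, for
example), using that instead would make reloads repeatable.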

Let me know if you have any more questions. :)

~Garrett


On Thu, Nov 21, 2013 at 5:38 AM, Naresh Yadav <[email protected]> wrote:

> hi,
>
> I am just reading about Apache Blur from last one day..and i found it
> perfect fit for my project. But i have some doubts :
>
> 1. Will i be able to Hadoop 2.0 existing cluster with Apache Blur latest
> version
>
> 2. My development enviornment is Windows and Hadoop 2.0 supports windows
> so   i have doubt will apache blur latest version will work on windows
> smoothly..i will get startup scripts for windows.
>
> 3. Here is 4 rows of my data which i need to store in one table :
>        Cost=50, Period=Nov13, Pool1=Tag1, Pool2=Tag2
>        Cost=50, Period=Nov13, Pool1=Tag1, Pool2=Tag3
>        Cost=150, Period=Dec13, Pool1=Tag1, Pool2=Tag2, Pool3=Tag3
>        Cost=150, Period=Dec13, Pool1=Tag1, Pool2=Tag2, Pool3=Tag4
>
>    Query 1 : I need get all rows with
>              Cost, Nov13, Tag1
>    Query 2: get all rows with Cost, Dec13, Tag1, Tag2
>      Will i be able to do perform such query if yes how should i design
> this Blur table for this use case. Note : In this table there can be
> million of rows with all historic data.
>
> Please help me, i am new to big data technologies..Your guidance will give
> me direction to proceed..
>
> Thanks
> Naresh
>
