Re: When would/should I use spark with phoenix?

2016-09-13 Thread dalin.qin
If you are using YARN as the resource negotiator, you will get
containers (CPU + memory) allocated from all the nodes. FYI:
http://spark.apache.org/docs/latest/running-on-yarn.html

It's a scalable parallel calculation. MapReduce (which Phoenix uses) will do
the same thing; its way of doing the calculation is just not as smart as
Spark's.
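For reference, a minimal spark-submit invocation against YARN might look like the sketch below; the executor count, core, and memory values are placeholder settings to adapt to your cluster, and `my_phoenix_job.py` is a hypothetical job script:

```shell
# Submit a PySpark job to a YARN cluster. The YARN ResourceManager
# negotiates executor containers (CPU + memory) across all nodes.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 8 \
  --executor-cores 2 \
  --executor-memory 4g \
  my_phoenix_job.py
```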


Re: When would/should I use spark with phoenix?

2016-09-13 Thread Cheyenne Forbes
If I were to use Spark (via the Python API, for example), would the query be
processed on my web servers or on a separate server, as in Phoenix?

Regards,

Cheyenne Forbes

Chief Executive Officer
Avapno Omnitech

Chief Operating Officer
Avapno Solutions, Co.

Chairman
Avapno Assets, LLC

Bethel Town P.O
Westmoreland
Jamaica

Email: cheyenne.osanu.for...@gmail.com
Mobile: 876-881-7889
skype: cheyenne.forbes1



Re: When would/should I use spark with phoenix?

2016-09-13 Thread dalin.qin
Hi Cheyenne ,

That's a very interesting question. If secondary indexes are created well
on the Phoenix table, HBase will use coprocessors to do the join operation
(still a Java-based MapReduce-style job, if I understand correctly) and then
return the result. Spark, on the contrary, is famous for its great
improvement over the traditional M/R model; once the two tables are in
Spark DataFrames, I believe Spark wins every time. However, it might take a
long time to load two big tables into Spark.

I'll do this test in the future; right now our system is quite busy with ALS
model tasks.
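For anyone who wants to run that comparison, here is a sketch of the Spark side using the phoenix-spark DataFrame reader. The table names, the join columns, and the ZooKeeper URL are made-up placeholders; the `table`/`zkUrl` options are what the phoenix-spark data source expects in the Phoenix 4.x line.

```python
def phoenix_reader_options(table, zk_url):
    # Options understood by the phoenix-spark data source (Phoenix 4.x).
    return {"table": table, "zkUrl": zk_url}

if __name__ == "__main__":
    # Requires a running Spark + Phoenix/HBase cluster.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("phoenix-join").getOrCreate()
    zk = "zk-host:2181"  # placeholder ZooKeeper quorum
    orders = (spark.read.format("org.apache.phoenix.spark")
              .options(**phoenix_reader_options("ORDERS", zk)).load())
    users = (spark.read.format("org.apache.phoenix.spark")
             .options(**phoenix_reader_options("USERS", zk)).load())
    # The join executes in Spark executors, not in HBase coprocessors;
    # loading the two tables is the expensive part mentioned above.
    joined = orders.join(users, orders["USER_ID"] == users["ID"])
    print(joined.count())
```

Timing the `count()` against the equivalent Phoenix SQL join would give the apples-to-apples number discussed in this thread.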

Cheers,
Dalin



Re: When would/should I use spark with phoenix?

2016-09-13 Thread Cheyenne Forbes
I've been thinking: is Spark SQL faster than Phoenix (or phoenix-spark) for
SELECTs with joins on large data (for example, Instagram's scale)?

Regards,

Cheyenne Forbes





Re: When would/should I use spark with phoenix?

2016-09-13 Thread Josh Mahonin
Hi Dalin,

Thanks for the information, I'm glad to hear that the spark integration is
working well for your use case.

Josh



Re: When would/should I use spark with phoenix?

2016-09-12 Thread dalin.qin
Hi Josh,

Before the project kicked off, we got the idea that HBase is more suitable
for massive writing than for batch full-table reads (I forget where the idea
came from; probably some benchmark testing posted on a website). So we
decided to read HBase only by primary key, for queries over small amounts of
data. We store the HBase results in JSON files as each day's incremental
changes (another benefit of JSON is that you can put the files in time-based
directories, so you can query only a subset of them), then use Spark to read
those JSON files and do the ML model or report calculation.
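As a sketch of that time-based directory layout, one directory of JSON dumps per day under a base path; the HDFS path and the `event_type` column are made-up examples:

```python
from datetime import date, timedelta

def daily_paths(base, start, num_days):
    # One directory of JSON files per day, e.g. <base>/2016-09-12
    return ["%s/%s" % (base, (start + timedelta(days=i)).isoformat())
            for i in range(num_days)]

if __name__ == "__main__":
    # Requires a Spark cluster and the JSON dumps on HDFS.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("daily-report").getOrCreate()
    # Read only one week of increments instead of the full history.
    week = spark.read.json(daily_paths("hdfs:///data/changes", date(2016, 9, 6), 7))
    week.groupBy("event_type").count().show()
```

Passing a list of paths to `spark.read.json` is what lets the job touch only the days it needs.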

Hope this could help:)

Dalin




Re: When would/should I use spark with phoenix?

2016-09-12 Thread Josh Mahonin
Hi Dalin,

That's great to hear. Have you also tried reading those rows back through
Spark for a larger "batch processing" job? I'm curious whether you have any
experience or insight from operating on a large dataset.

Thanks!

Josh



Re: When would/should I use spark with phoenix?

2016-09-12 Thread dalin.qin
Hi,
I've used a Phoenix table to store billions of rows. Rows are incrementally
inserted into Phoenix by Spark every day, and the table serves instant
queries from a web page by primary key. So far so good.
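A sketch of that daily incremental load with the phoenix-spark writer; the `EVENTS` table, the staging path, and the ZooKeeper URL are placeholders. Note that `mode("overwrite")` is what the phoenix-spark API expects, but the write is executed as Phoenix UPSERTs, so re-running a day is idempotent on the primary key.

```python
def staging_path(base, day):
    # Where the day's rows to load are staged, e.g. <base>/2016-09-12
    return "%s/%s" % (base, day)

if __name__ == "__main__":
    # Requires a Spark cluster and a Phoenix table named EVENTS whose
    # columns match the DataFrame (including the primary key).
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("daily-phoenix-load").getOrCreate()
    rows = spark.read.json(staging_path("hdfs:///staging/events", "2016-09-12"))
    (rows.write.format("org.apache.phoenix.spark")
         .mode("overwrite")  # required by phoenix-spark; semantics are UPSERT
         .option("table", "EVENTS")
         .option("zkUrl", "zk-host:2181")
         .save())
```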

Thanks
Dalin



Re: When would/should I use spark with phoenix?

2016-09-12 Thread Cheyenne Forbes
Thanks everyone. I will be using Phoenix for simple input/output and
the phoenix_spark plugin (https://phoenix.apache.org/phoenix_spark.html)
for more complex queries. Is that the smart thing to do?

Regards,

Cheyenne Forbes





Re: When would/should I use spark with phoenix?

2016-09-11 Thread Ted Yu
w.r.t. Resource Management, Spark also relies on other frameworks such as
YARN or Mesos.

Cheers



Re: When would/should I use spark with phoenix?

2016-09-11 Thread Josh Mahonin
Just to add to James' comment, they're indeed complementary and it all
comes down to your own use case. Phoenix offers a convenient SQL interface
over HBase, which is capable of doing very fast queries. If you're just
doing insert / retrieval, it's unlikely that Spark will help you much there.

However, if you have requirements to do some of the types of "big data
processing" that Spark excels at, such as graph algorithms or machine
learning, the plugin allows you to access the data in Phoenix+HBase.

Good luck,

Josh



Re: When would/should I use spark with phoenix?

2016-09-11 Thread James Taylor
It's not an either/or with Phoenix and Spark - often companies use both as
they're very complementary. See this [1] blog for an example. Spark is a
processing engine while Phoenix+HBase is a database/store. You'll need to
store your data somewhere.
Thanks,
James

[1]
http://tech.marinsoftware.com/nosql/digital-advertising-storage-on-apache-hbase-and-apache-phoenix/?platform=hootsuite



Re: When would/should I use spark with phoenix?

2016-09-11 Thread Cheyenne Forbes
Thank you. For a project as big as Facebook or Snapchat, would you
recommend using Spark or Phoenix for things such as message
retrieval/insertion, user search, user feed retrieval/insertion, etc., and
what are the pros and cons?

Regards,
Cheyenne




Re: When would/should I use spark with phoenix?

2016-09-11 Thread John Leach
Spark has a robust execution model with the following features that are not
part of Phoenix:
* Scalable execution
* Fault tolerance with lineage (handles large intermediate results)
* Memory management for tasks
* Resource management (fair scheduling)
* Additional SQL features (windowing, etc.)
* Machine learning libraries


Regards,
John




When would/should I use spark with phoenix?

2016-09-11 Thread Cheyenne Forbes
I realized there is a Spark plugin for Phoenix. Any use cases? Why would I
use Spark with Phoenix instead of Phoenix by itself?