Re: When would/should I use spark with phoenix?
If you are using YARN as the resource negotiator, you will get containers (CPU + memory) allocated from all the nodes. FYI: http://spark.apache.org/docs/latest/running-on-yarn.html. It's a scalable parallel calculation. MapReduce (Phoenix) will do the same thing; it's just that its way of doing the calculation is not as smart as Spark's.

On Tue, Sep 13, 2016 at 4:16 PM, Cheyenne Forbes <cheyenne.osanu.for...@gmail.com> wrote:
> If I was to use Spark (via the Python API, for example), would the query be
> processed on my web servers or on a separate server, like in Phoenix?
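As a concrete illustration of the point above, here is a hedged PySpark sketch (the table name, ZooKeeper quorum, and resource numbers are all hypothetical, not from the thread): with `master("yarn")`, the driver asks YARN for executor containers with the CPU and memory configured below, and the scan and aggregation run inside those containers on the cluster nodes rather than on the machine that submitted the job.

```python
def yarn_executor_conf(instances, cores, memory_gb):
    """Spark conf entries that tell YARN how many executor containers
    to grant and how much CPU/memory each container gets."""
    return {
        "spark.executor.instances": str(instances),
        "spark.executor.cores": str(cores),
        "spark.executor.memory": f"{memory_gb}g",
    }

def run_on_yarn():
    # Requires a Hadoop/YARN client environment; normally launched via
    # spark-submit --master yarn. Importing here keeps the helper above
    # usable even where pyspark is not installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName("phoenix-aggregation").master("yarn")
    for key, value in yarn_executor_conf(4, 2, 4).items():
        builder = builder.config(key, value)
    spark = builder.getOrCreate()

    # Hypothetical Phoenix table and ZooKeeper quorum, read through the
    # phoenix-spark connector.
    df = (spark.read.format("org.apache.phoenix.spark")
          .option("table", "WEB_STAT")
          .option("zkUrl", "zk-host:2181")
          .load())

    # The aggregation executes in parallel inside the YARN containers.
    df.groupBy("HOST").count().show()
```

Calling `run_on_yarn()` from a job submitted with `spark-submit --master yarn` is the usual pattern; the web server (or edge node) only hosts the driver, not the computation.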
Re: When would/should I use spark with phoenix?
If I was to use Spark (via the Python API, for example), would the query be processed on my web servers or on a separate server, like in Phoenix?

Regards,

Cheyenne Forbes

Chief Executive Officer
Avapno Omnitech

Chief Operating Officer
Avapno Solutions, Co.

Chairman
Avapno Assets, LLC

Bethel Town P.O
Westmoreland
Jamaica

Email: cheyenne.osanu.for...@gmail.com
Mobile: 876-881-7889
skype: cheyenne.forbes1

On Tue, Sep 13, 2016 at 3:07 PM, dalin.qin wrote:
> Hi Cheyenne,
>
> That's a very interesting question. If secondary indexes are created well
> on the Phoenix table, HBase will use a coprocessor to do the join operation
> (still a Java-based MapReduce job, if I understand correctly) and then
> return the result. On the contrary, Spark is famous for its great
> improvement over the traditional M/R operation; once the two tables are in
> Spark DataFrames, I believe Spark wins all the time. However, it might take
> a long time to load two big tables into Spark.
Re: When would/should I use spark with phoenix?
Hi Cheyenne,

That's a very interesting question. If secondary indexes are created well on the Phoenix table, HBase will use a coprocessor to do the join operation (still a Java-based MapReduce job, if I understand correctly) and then return the result. On the contrary, Spark is famous for its great improvement over the traditional M/R operation; once the two tables are in Spark DataFrames, I believe Spark wins all the time. However, it might take a long time to load two big tables into Spark.

I'll do this test in the future; right now our system is quite busy with ALS model tasks.

Cheers,
Dalin

On Tue, Sep 13, 2016 at 3:58 PM, Cheyenne Forbes <cheyenne.osanu.for...@gmail.com> wrote:
> I've been thinking: is Spark SQL faster than Phoenix (or phoenix-spark)
> for SELECTs with joins on large data (for example, Instagram's size)?
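The two approaches being compared can be sketched roughly as follows. All table, column, and host names here are hypothetical, not from the thread: the first variant pushes the join to Phoenix as a single SQL statement, where a covering secondary index on the join key lets the server-side coprocessors do the work; the second loads both tables into Spark DataFrames first and joins there, which is fast once loaded but pays the loading cost up front.

```python
# Phoenix-side: create a covering index on the join key, then run the
# join as one SQL statement so HBase executes it via coprocessors.
PHOENIX_INDEX_DDL = (
    "CREATE INDEX IF NOT EXISTS IDX_ORDERS_USER "
    "ON ORDERS (USER_ID) INCLUDE (AMOUNT)"
)
PHOENIX_JOIN_SQL = (
    "SELECT u.NAME, SUM(o.AMOUNT) "
    "FROM USERS u JOIN ORDERS o ON u.ID = o.USER_ID "
    "GROUP BY u.NAME"
)

def join_in_spark(spark, zk_url="zk-host:2181"):
    """Spark-side: load both Phoenix tables into DataFrames, then join
    in Spark. Loading two big tables may take a long time; the join
    itself is fast once they are in memory."""
    def load(table):
        return (spark.read.format("org.apache.phoenix.spark")
                .option("table", table)
                .option("zkUrl", zk_url)
                .load())

    users, orders = load("USERS"), load("ORDERS")
    return (users.join(orders, users.ID == orders.USER_ID)
            .groupBy("NAME")
            .sum("AMOUNT"))
```

Which one wins depends on whether the loading cost can be amortized over many queries, which is essentially the open question Dalin raises above.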
Re: When would/should I use spark with phoenix?
I've been thinking: is Spark SQL faster than Phoenix (or phoenix-spark) for SELECTs with joins on large data (for example, Instagram's size)?

Regards,

Cheyenne Forbes

On Tue, Sep 13, 2016 at 8:41 AM, Josh Mahonin wrote:
> Hi Dalin,
>
> Thanks for the information, I'm glad to hear that the Spark integration is
> working well for your use case.
Re: When would/should I use spark with phoenix?
Hi Dalin,

Thanks for the information, I'm glad to hear that the Spark integration is working well for your use case.

Josh

On Mon, Sep 12, 2016 at 8:15 PM, dalin.qin wrote:
> Hi Josh,
>
> Before the project kicked off, we got the idea that HBase is more suitable
> for massive writing rather than batch full-table reading (I forget where
> the idea came from; just some benchmark testing posted on a website,
> maybe). So we decided to read HBase only by primary key, for small-data
> query requests. We store the HBase results in JSON files as each day's
> incremental changes (another benefit of JSON is that you can put the files
> in time-based directories, so you can query only a subset of them), then
> use Spark to read those JSON files and do the ML model or report
> calculation.
>
> Hope this could help :)
>
> Dalin
Re: When would/should I use spark with phoenix?
Hi Josh,

Before the project kicked off, we got the idea that HBase is more suitable for massive writing rather than batch full-table reading (I forget where the idea came from; just some benchmark testing posted on a website, maybe). So we decided to read HBase only by primary key, for small-data query requests. We store the HBase results in JSON files as each day's incremental changes (another benefit of JSON is that you can put the files in time-based directories, so you can query only a subset of them), then use Spark to read those JSON files and do the ML model or report calculation.

Hope this could help :)

Dalin

On Mon, Sep 12, 2016 at 5:36 PM, Josh Mahonin wrote:
> Hi Dalin,
>
> That's great to hear. Have you also tried reading back those rows through
> Spark for a larger "batch processing" job? I am curious if you have any
> experiences or insight there from operating on a large dataset.
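The workflow above can be sketched as follows; the base path and column name are hypothetical placeholders. Writing each day's incremental changes under a date-based directory means a later job can hand Spark only the directories for the dates it needs, instead of scanning the whole dump:

```python
from datetime import date, timedelta

def daily_dir(base, day):
    """Time-based directory for one day's incremental JSON dump,
    e.g. /data/changes/2016/09/12."""
    return f"{base}/{day.year:04d}/{day.month:02d}/{day.day:02d}"

def dirs_for_range(base, start, end):
    """Directories covering [start, end] inclusive, so Spark reads
    only that window of files."""
    out, day = [], start
    while day <= end:
        out.append(daily_dir(base, day))
        day += timedelta(days=1)
    return out

def report_for_window(spark, start, end, base="/data/changes"):
    # spark.read.json accepts a list of paths, so only the selected
    # days' files are loaded before the report calculation runs.
    df = spark.read.json(dirs_for_range(base, start, end))
    return df.groupBy("user_id").count()
```

The same layout works for ML feature extraction: point the reader at the training window's directories and build the DataFrame from just those files.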
Re: When would/should I use spark with phoenix?
Hi Dalin,

That's great to hear. Have you also tried reading back those rows through Spark for a larger "batch processing" job? I am curious if you have any experiences or insight there from operating on a large dataset.

Thanks!

Josh

On Mon, Sep 12, 2016 at 10:29 AM, dalin.qin wrote:
> Hi,
> I've used a Phoenix table to store billions of rows. Rows are
> incrementally inserted into Phoenix by Spark every day, and the table is
> used for instant queries from a web page by providing the primary key. So
> far so good.
Re: When would/should I use spark with phoenix?
Hi,
I've used a Phoenix table to store billions of rows. Rows are incrementally inserted into Phoenix by Spark every day, and the table is used for instant queries from a web page by providing the primary key. So far so good.

Thanks
Dalin

On Mon, Sep 12, 2016 at 10:07 AM, Cheyenne Forbes <cheyenne.osanu.for...@gmail.com> wrote:
> Thanks everyone, I will be using Phoenix for simple input/output and the
> phoenix_spark plugin (https://phoenix.apache.org/phoenix_spark.html) for
> more complex queries. Is that the smart thing?
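A hedged sketch of that daily pipeline (table, column, and quorum names are hypothetical, not from the thread): the phoenix-spark connector writes each DataFrame partition into the Phoenix table in parallel, and because Phoenix writes are upserts keyed on the primary key, re-running a day's batch is safe. Web-page lookups then hit the table by primary key as described above.

```python
def upsert_options(table, zk_url):
    """Options for writing a DataFrame to Phoenix via phoenix-spark."""
    return {"table": table, "zkUrl": zk_url}

def append_daily_rows(df, table="EVENTS", zk_url="zk-host:2181"):
    """Incrementally upsert one day's rows into the Phoenix table.
    Each Spark partition writes through the connector in parallel."""
    (df.write.format("org.apache.phoenix.spark")
       # phoenix-spark expects overwrite mode, but the underlying
       # operation is an UPSERT on the table's primary key.
       .mode("overwrite")
       .options(**upsert_options(table, zk_url))
       .save())
```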
Re: When would/should I use spark with phoenix?
Thanks everyone, I will be using Phoenix for simple input/output and the phoenix_spark plugin (https://phoenix.apache.org/phoenix_spark.html) for more complex queries. Is that the smart thing?

Regards,

Cheyenne Forbes

On Sun, Sep 11, 2016 at 11:07 AM, Ted Yu wrote:
> w.r.t. resource management, Spark also relies on other frameworks such as
> YARN or Mesos.
>
> Cheers
Re: When would/should I use spark with phoenix?
w.r.t. resource management, Spark also relies on other frameworks such as YARN or Mesos.

Cheers

On Sun, Sep 11, 2016 at 6:31 AM, John Leach wrote:
> Spark has a robust execution model with the following features that are
> not part of Phoenix:
>     * Scalable
>     * Fault tolerance with lineage (handles large intermediate results)
>     * Memory management for tasks
>     * Resource management (fair scheduling)
>     * Additional SQL features (windowing, etc.)
>     * Machine learning libraries
>
> Regards,
> John
Re: When would/should I use spark with phoenix?
Just to add to James' comment: they're indeed complementary, and it all comes down to your own use case.

Phoenix offers a convenient SQL interface over HBase, which is capable of doing very fast queries. If you're just doing insert / retrieval, it's unlikely that Spark will help you much there.

However, if you have requirements to do some of the types of "big data processing" that Spark excels at, such as graph algorithms or machine learning, the plugin allows you to access the data in Phoenix+HBase.

Good luck,

Josh

On Sun, Sep 11, 2016 at 11:12 AM, James Taylor wrote:
> It's not an either/or with Phoenix and Spark: often companies use both,
> as they're very complementary. See this [1] blog for an example. Spark is
> a processing engine, while Phoenix+HBase is a database/store. You'll need
> to store your data somewhere.
> Thanks,
> James
>
> [1] http://tech.marinsoftware.com/nosql/digital-advertising-storage-on-apache-hbase-and-apache-phoenix/?platform=hootsuite
Re: When would/should I use spark with phoenix?
It's not an either/or with Phoenix and Spark: often companies use both, as they're very complementary. See this [1] blog for an example. Spark is a processing engine, while Phoenix+HBase is a database/store. You'll need to store your data somewhere.
Thanks,
James

[1] http://tech.marinsoftware.com/nosql/digital-advertising-storage-on-apache-hbase-and-apache-phoenix/?platform=hootsuite

On Sunday, September 11, 2016, Cheyenne Forbes <cheyenne.osanu.for...@gmail.com> wrote:
> Thank you. For a project as big as Facebook or Snapchat, would you
> recommend using Spark or Phoenix for things such as message
> retrieval/insert, user search, user feed retrieval/insert, etc., and what
> are the pros and cons?
>
> Regards,
> Cheyenne
Re: When would/should I use spark with phoenix?
Thank you. For a project as big as Facebook or Snapchat, would you recommend using Spark or Phoenix for things such as message retrieval/insert, user search, user feed retrieval/insert, etc., and what are the pros and cons?

Regards,
Cheyenne

On Sun, Sep 11, 2016 at 8:31 AM, John Leach wrote:
> Spark has a robust execution model with the following features that are
> not part of Phoenix:
>     * Scalable
>     * Fault tolerance with lineage (handles large intermediate results)
>     * Memory management for tasks
>     * Resource management (fair scheduling)
>     * Additional SQL features (windowing, etc.)
>     * Machine learning libraries
>
> Regards,
> John
Re: When would/should I use spark with phoenix?
Spark has a robust execution model with the following features that are not part of Phoenix:

    * Scalable
    * Fault tolerance with lineage (handles large intermediate results)
    * Memory management for tasks
    * Resource management (fair scheduling)
    * Additional SQL features (windowing, etc.)
    * Machine learning libraries

Regards,
John

> On Sep 11, 2016, at 2:45 AM, Cheyenne Forbes <cheyenne.osanu.for...@gmail.com> wrote:
>
> I realized there is a Spark plugin for Phoenix; any use cases? Why would I
> use Spark with Phoenix instead of Phoenix by itself?
When would/should I use spark with phoenix?
I realized there is a Spark plugin for Phoenix; any use cases? Why would I use Spark with Phoenix instead of Phoenix by itself?