Re: [HELP:]Save Spark Dataframe in Phoenix Table

2016-04-08 Thread Josh Mahonin
Hi Divya, That's strange. Are you able to post a snippet of your code to look at? And are you sure that you're saving the dataframes as per the docs (https://phoenix.apache.org/phoenix_spark.html)? Depending on your HDP version, it may or may not actually have phoenix-spark support.
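For reference, the save path described on that page looks roughly like this minimal sketch, assuming Spark 1.4+ with the phoenix-spark JAR on the classpath; the table name and zkUrl below are placeholders, not values from this thread:

    import org.apache.spark.sql.SaveMode

    // Minimal sketch: write an existing DataFrame to a Phoenix table.
    // The target table and its columns must already exist in Phoenix;
    // "OUTPUT_TABLE" and the zkUrl are placeholders.
    df.write
      .format("org.apache.phoenix.spark")
      .mode(SaveMode.Overwrite)
      .options(Map("table" -> "OUTPUT_TABLE", "zkUrl" -> "zk-host:2181"))
      .save()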

Re: About Spark On Hbase

2015-12-15 Thread Josh Mahonin
And as yet another option, there is https://phoenix.apache.org/phoenix_spark.html. However, it requires that you also use Phoenix in conjunction with HBase. On Tue, Dec 15, 2015 at 4:16 PM, Ted Yu wrote: > There is also
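A minimal sketch of what reading through that integration looks like, per the page above; the table name, column list, and zkUrl are placeholders:

    import org.apache.phoenix.spark._

    // phoenix-spark adds phoenixTableAsDataFrame via an implicit on SQLContext.
    // "TABLE1", the columns, and the zkUrl are placeholders.
    val df = sqlContext.phoenixTableAsDataFrame(
      "TABLE1", Seq("ID", "COL1"), zkUrl = Some("zk-host:2181"))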

Re: Apache Phoenix (4.3.1 and 4.4.0-HBase-0.98) on Spark 1.3.1 ClassNotFoundException

2015-06-11 Thread Josh Mahonin
http://stackoverflow.com/questions/30639659/apache-phoenix-4-3-1-and-4-4-0-hbase-0-98-on-spark-1-3-1-classnotfoundexceptio Have a great day! Cheers, Jeroen. On Wednesday 10 June 2015 08:58:02, Josh Mahonin wrote: Hi Jeroen, Rather than bundle the Phoenix client JAR with your app
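For context, the alternative to bundling that the phoenix-spark docs suggest is putting the client JAR on both the driver and executor classpaths. A sketch of the spark-defaults.conf approach, with the JAR path as a placeholder for wherever the client JAR lives on your nodes:

    spark.driver.extraClassPath   /path/to/phoenix-4.4.0-HBase-0.98-client.jar
    spark.executor.extraClassPath /path/to/phoenix-4.4.0-HBase-0.98-client.jar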

Re: Apache Phoenix (4.3.1 and 4.4.0-HBase-0.98) on Spark 1.3.1 ClassNotFoundException

2015-06-10 Thread Josh Mahonin
/app-20150610010512-0001/0/./metrics-core-2.2.0.jar] On Tuesday, June 09, 2015 11:18:08 AM Josh Mahonin wrote: This may or may not be helpful for your classpath issues, but I wanted to verify that basic functionality worked, so I made a sample app here: https://github.com/jmahonin/spark

Re: Apache Phoenix (4.3.1 and 4.4.0-HBase-0.98) on Spark 1.3.1 ClassNotFoundException

2015-06-09 Thread Josh Mahonin
This may or may not be helpful for your classpath issues, but I wanted to verify that basic functionality worked, so I made a sample app here: https://github.com/jmahonin/spark-streaming-phoenix This consumes events off a Kafka topic using Spark Streaming, and writes out event counts to Phoenix.
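A minimal sketch of that pattern (not the sample app itself; the topic, table, and column names are placeholders, and it assumes Spark 1.x with spark-streaming-kafka and phoenix-spark on the classpath):

    import org.apache.phoenix.spark._
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    val ssc = new StreamingContext(sc, Seconds(10))

    // Receiver-based Kafka stream: (key, message) pairs from the "events" topic
    val stream = KafkaUtils.createStream(ssc, "zk-host:2181", "event-counter", Map("events" -> 1))

    // Count events per payload in each batch and upsert the counts into Phoenix
    stream.map { case (_, event) => (event, 1L) }
      .reduceByKey(_ + _)
      .foreachRDD { rdd =>
        rdd.saveToPhoenix("EVENT_COUNTS", Seq("EVENT", "EVENT_COUNT"),
          zkUrl = Some("zk-host:2181"))
      }

    ssc.start()
    ssc.awaitTermination()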

Re: Spark SQL with Apache Phoenix lower and upper Bound

2014-11-24 Thread Josh Mahonin
Another question: I still haven't tried this out, but I'll actually be using this with PySpark, so I'm guessing the PhoenixPigConfiguration and newHadoopRDD can be defined in PySpark as well? Regards, Alaa Ali. On Fri, Nov 21, 2014 at 4:34 PM, Josh Mahonin jmaho...@interset.com wrote: Hi

Re: Spark SQL with Apache Phoenix lower and upper Bound

2014-11-21 Thread Josh Mahonin
Hi Alaa Ali, In order for Spark to split the JDBC query in parallel, it expects an upper and lower bound for your input data, as well as a number of partitions so that it can split the query across multiple tasks. For example, depending on your data distribution, you could set an upper and lower
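A minimal sketch of the JdbcRDD shape being described; the bounds, partition count, and connection URL are placeholders:

    import java.sql.DriverManager
    import org.apache.spark.rdd.JdbcRDD

    // Spark substitutes the lower/upper bounds into the two '?' markers
    // and splits that range across the partitions, one sub-query per task.
    val rdd = new JdbcRDD(sc,
      () => DriverManager.getConnection("jdbc:phoenix:zk-host:2181"),
      "SELECT ID, COL1 FROM MY_TABLE WHERE ID >= ? AND ID <= ?",
      lowerBound = 1L, upperBound = 1000000L, numPartitions = 10,
      r => (r.getLong(1), r.getString(2)))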

Re: Data from Mysql using JdbcRDD

2014-07-30 Thread Josh Mahonin
Hi Srini, I believe the JdbcRDD requires input splits based on ranges within the query itself. As an example, you could adjust your query to something like: SELECT * FROM student_info WHERE id >= ? AND id <= ? Note that the values you've passed in '1, 20, 2' correspond to the lower bound index,
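Putting those pieces together, a minimal sketch using the values from the thread (1 and 20 as the lower and upper bounds, 2 partitions); the MySQL URL, credentials, and the row mapping are placeholders:

    import java.sql.DriverManager
    import org.apache.spark.rdd.JdbcRDD

    val students = new JdbcRDD(sc,
      () => DriverManager.getConnection("jdbc:mysql://db-host:3306/school", "user", "pass"),
      "SELECT * FROM student_info WHERE id >= ? AND id <= ?",
      lowerBound = 1L, upperBound = 20L, numPartitions = 2,
      r => r.getInt("id"))   // map each row; the 'id' column here is a placeholder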

Re: Spark and HBase

2014-04-25 Thread Josh Mahonin
Phoenix generally presents itself as a JDBC endpoint, which in my testing seems to play nicely with JdbcRDD. However, a few days ago a patch was made against Phoenix to implement Pig support using a custom Hadoop InputFormat, which means it now has Spark support too. Here's a code
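The snippet is cut off in the archive; below is a hedged sketch of what reading through that InputFormat looked like at the time, assuming the Phoenix 4.0-era class names from the pig module (PhoenixPigConfiguration, PhoenixInputFormat, PhoenixRecord), which may have moved in later releases. The server, table, and columns are placeholders.

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.io.NullWritable
    import org.apache.phoenix.pig.PhoenixPigConfiguration
    import org.apache.phoenix.pig.hadoop.{PhoenixInputFormat, PhoenixRecord}

    // Assumed-era API: configure the select statement, then hand the
    // InputFormat to newAPIHadoopRDD.
    val phoenixConf = new PhoenixPigConfiguration(new Configuration())
    phoenixConf.setSelectStatement("SELECT ID, NAME FROM MY_TABLE")
    phoenixConf.setSelectColumns("ID,NAME")
    phoenixConf.configure("zk-host", "MY_TABLE", 100L)

    val rdd = sc.newAPIHadoopRDD(
      phoenixConf.getConfiguration,
      classOf[PhoenixInputFormat],
      classOf[NullWritable],
      classOf[PhoenixRecord])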