Hi Divya,
That's strange. Are you able to post a snippet of your code to look at? And
are you sure that you're saving the dataframes as per the docs (
https://phoenix.apache.org/phoenix_spark.html)?
Depending on your HDP version, it may or may not actually have
phoenix-spark support.
And as yet another option, there is
https://phoenix.apache.org/phoenix_spark.html
It requires, however, that you are also using Phoenix in conjunction with
HBase.
On Tue, Dec 15, 2015 at 4:16 PM, Ted Yu wrote:
> There is also:
> http://stackoverflow.com/questions/30639659/apache-phoenix-4-3-1-and-4-4-0-hbase-0-98-on-spark-1-3-1-classnotfoundexceptio
Have a great day!
Cheers,
Jeroen
On Wednesday 10 June 2015 08:58:02 Josh Mahonin wrote:
Hi Jeroen,
Rather than bundle the Phoenix client JAR with your app
/app-20150610010512-0001/0/./metrics-core-2.2.0.jar]
On Tuesday, June 09, 2015 11:18:08 AM Josh Mahonin wrote:
This may or may not be helpful for your classpath issues, but I wanted to
verify that basic functionality worked, so I made a sample app here:
https://github.com/jmahonin/spark-streaming-phoenix
This consumes events off a Kafka topic using Spark Streaming, and writes
out event counts to Phoenix.
Another question: I still haven't tried this out, but I'll actually be
using this with PySpark, so I'm guessing the PhoenixPigConfiguration and
newHadoopRDD can be defined in PySpark as well?
Regards,
Alaa Ali
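For what it's worth, PySpark's SparkContext does expose newAPIHadoopRDD, and it takes a plain dict of Hadoop configuration keys. Below is a rough sketch of what wiring that up might look like; the Phoenix configuration key and class names are illustrative assumptions, not verified against a running cluster, so check the phoenix-pig sources for the real ones:

```python
# Illustrative sketch of driving a Hadoop InputFormat from PySpark.
# The Phoenix key/class names below are assumptions for illustration only.

def build_phoenix_conf(table, zk_quorum):
    """Assemble the Hadoop configuration dict to pass to newAPIHadoopRDD."""
    return {
        "hbase.zookeeper.quorum": zk_quorum,   # ZooKeeper quorum for HBase
        "phoenix.input.table.name": table,     # assumed key name
    }

conf = build_phoenix_conf("STUDENT_INFO", "localhost:2181")

# With a live SparkContext `sc`, the call would look roughly like:
# rdd = sc.newAPIHadoopRDD(
#     "org.apache.phoenix.pig.hadoop.PhoenixInputFormat",  # assumed class
#     "org.apache.hadoop.io.NullWritable",
#     "org.apache.phoenix.pig.hadoop.PhoenixRecord",       # assumed class
#     conf=conf)
print(conf["phoenix.input.table.name"])  # → STUDENT_INFO
```

The commented-out call is only a shape sketch; the real class names and any key/value converters needed on the Python side would have to come from the Phoenix docs.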
On Fri, Nov 21, 2014 at 4:34 PM, Josh Mahonin jmaho...@interset.com
wrote:
Hi Alaa Ali,
In order for Spark to split the JDBC query in parallel, it expects an upper
and lower bound for your input data, as well as a number of partitions so
that it can split the query across multiple tasks.
For example, depending on your data distribution, you could set an upper
and lower bound, along with a number of partitions, so that each task
fetches a roughly even slice of the data.
Hi Srini,
I believe the JdbcRDD requires input splits based on ranges within the
query itself. As an example, you could adjust your query to something like:
SELECT * FROM student_info WHERE id >= ? AND id <= ?
Note that the values you've passed in '1, 20, 2' correspond to the lower
bound index, the upper bound index, and the number of partitions.
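To make the splitting concrete, here is a small pure-Python sketch that mirrors the bound-splitting arithmetic a JdbcRDD-style partitioner performs; the function name is illustrative, not a Spark API:

```python
# Sketch of turning (lowerBound, upperBound, numPartitions) into the
# inclusive (start, end) pair each partition's query is run with.
# Mirrors JdbcRDD-style integer splitting; not an actual Spark API.

def split_bounds(lower, upper, num_partitions):
    """Return one inclusive (start, end) bound pair per partition."""
    length = upper - lower + 1
    ranges = []
    for i in range(num_partitions):
        start = lower + (i * length) // num_partitions
        end = lower + ((i + 1) * length) // num_partitions - 1
        ranges.append((start, end))
    return ranges

# The '1, 20, 2' example: two tasks, each substituting its own pair of
# bounds into SELECT * FROM student_info WHERE id >= ? AND id <= ?
print(split_bounds(1, 20, 2))  # → [(1, 10), (11, 20)]
```

So with those parameters, one task scans ids 1–10 and the other ids 11–20, which is why the query needs the two bound placeholders.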
Phoenix generally presents itself as an endpoint using JDBC, which in my
testing seems to play nicely using JdbcRDD.
However, a few days ago a patch was made against Phoenix to implement
support via Pig using a custom Hadoop InputFormat, which means it now has
Spark support too.
Here's a code