Hi,
I am trying to fetch data from an Oracle DB using a subquery and am
experiencing a lot of performance issues.
Below is the query I am using.
Using Spark 2.0.2:

val df = spark_session.read.format("jdbc")
  .option("driver", "oracle.jdbc.OracleDriver")
  .option("url", jdbc_url)
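When a subquery is slow through the JDBC source, two things usually help: pushing the subquery down as a derived table via the dbtable option, and parallelizing the read with the partition options. A minimal sketch, not from the original thread; the table, column, and bound values are hypothetical:

```scala
// Sketch: push the subquery down as a derived table and parallelize the
// JDBC read. Table, column, and bound values below are hypothetical.
val jdbcOptions = Map(
  "driver"          -> "oracle.jdbc.OracleDriver",
  "dbtable"         -> "(SELECT id, amount FROM sales WHERE region = 'EU') t",
  "partitionColumn" -> "id",     // numeric column Spark splits the read on
  "lowerBound"      -> "1",
  "upperBound"      -> "1000000",
  "numPartitions"   -> "8"       // up to 8 parallel Oracle connections
)
// In the job (continuing the snippet above):
//   val df = spark_session.read.format("jdbc")
//     .option("url", jdbc_url)
//     .options(jdbcOptions)
//     .load()
println(jdbcOptions("numPartitions"))
```

Without partitionColumn/numPartitions the whole subquery is read over a single connection, which is often the cause of this kind of slowness.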
> I'd suggest to use Flume, if possible, as it has built-in HDFS log
> rolling capabilities.
>
> On Mon, Jun 26, 2017 at 1:09 PM, Naveen Madhire <vmadh...@umail.iu.edu>
> wrote:
Hi,
I am using Spark Streaming with a 1-minute batch duration to read data from a
Kafka topic, apply transformations, and persist into HDFS.
The application is creating a new directory every minute, each containing many
partition files (one per RDD partition). Which parameter do I need to
change/configure to persist
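One common fix (a sketch, not from the original thread) is to reduce the partition count just before the write, so each one-minute directory holds only a few files; the partition counts below are hypothetical:

```scala
// Sketch: coalesce each micro-batch RDD before persisting to HDFS.
// In the streaming job this would look like:
//   stream.foreachRDD { rdd =>
//     rdd.coalesce(outputFiles).saveAsTextFile(outputDir)
//   }
val rddPartitions = 200 // hypothetical partition count after transformations
val outputFiles   = 4   // desired files per one-minute directory
// coalesce never increases the partition count, so the result is:
val filesWritten = math.min(rddPartitions, outputFiles)
println(filesWritten)
```

coalesce avoids a full shuffle, which makes it a cheap way to cut the file count per batch; repartition would also work but shuffles all the data.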
Hi All,
I am running the Wikipedia parsing example from the Advanced
Analytics with Spark book.
https://github.com/sryza/aas/blob/d3f62ef3ed43a59140f4ae8afbe2ef81fc643ef2/ch06-lsa/src/main/scala/com/cloudera/datascience/lsa/ParseWikipedia.scala#l112
The partitions of the RDD returned by
Hi,
I am running pyspark on Windows and I am seeing an error while adding
pyfiles to the SparkContext. Below is the example:

sc = SparkContext("local", "Sample", pyFiles="C:/sample/yattag.zip")

This fails with a "no file found" error for "C".
The below logic is treating the path as individual files like "C",
I had a similar issue with Spark 1.3.
After migrating to Spark 1.4 and using sqlContext.read.json it worked well.
I think you can look at the DataFrame select and explode options to read the
nested JSON elements, arrays etc.
Thanks.
On Mon, Jul 20, 2015 at 11:07 AM, Davies Liu dav...@databricks.com
I am facing the same issue. I tried this but am getting a compilation error
for the $ in the explode function.
So, I had to modify it to the below to make it work:

df.select(explode(new Column("entities.user_mentions")).as("mention"))
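For reference, the $"..." syntax only compiles after importing the SQL implicits; with that import the original form works too. As a plain-Scala sketch of what explode does (the tweet-like data here is hypothetical), it behaves like flatMap, producing one output row per array element:

```scala
// explode() produces one output row per array element, much like flatMap.
// Hypothetical tweet-like records: (tweet id, user mentions).
val tweets = Seq(
  (1, Seq("alice", "bob")),
  (2, Seq.empty[String]),
  (3, Seq("carol"))
)
// The $ form compiles once the SQL implicits are imported (Spark 1.x):
//   import sqlContext.implicits._
//   df.select(explode($"entities.user_mentions").as("mention"))
val mentions = tweets.flatMap { case (id, ms) => ms.map(m => (id, m)) }
println(mentions.size)  // 3 rows; the empty array contributes none
```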
On Wed, Jun 24, 2015 at 2:48 PM, Michael Armbrust
Yes, I did this recently. You need to copy the Cloudera cluster's conf
files onto the local machine
and set HADOOP_CONF_DIR or YARN_CONF_DIR.
The local machine should also be able to ssh to the Cloudera cluster.
On Wed, Jul 15, 2015 at 8:51 AM, ayan guha guha.a...@gmail.com wrote:
Use spark-testing-base from
spark-packages.org as a basis for your unit tests.
On Fri, Jul 10, 2015 at 12:03 PM, Daniel Siegmann
daniel.siegm...@teamaol.com wrote:
On Fri, Jul 10, 2015 at 1:41 PM, Naveen Madhire vmadh...@umail.iu.edu
wrote:
I want to write junit test cases in scala
Hi,
I want to write JUnit test cases in Scala for testing a Spark application. Is
there any guide or link which I can refer to?
Thank you very much.
-Naveen
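A pattern that keeps such tests simple (a sketch; the names here are hypothetical, not from the thread) is to put the transformation logic into a plain function and test it directly, reserving spark-testing-base's shared-context helpers for the parts that truly need an RDD:

```scala
// Sketch: keep transformation logic in a plain function so most of it can
// be unit-tested without a cluster; names here are hypothetical.
object WordCountLogic {
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines.flatMap(_.split("\\s+")).filter(_.nonEmpty)
      .groupBy(identity).map { case (w, ws) => (w, ws.size) }
}
// In a ScalaTest/JUnit suite, spark-testing-base's SharedSparkContext can
// supply a local SparkContext for the few tests that really need an RDD.
val counts = WordCountLogic.countWords(Seq("spark test", "spark"))
println(counts("spark"))  // 2
```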
Hi All,
I am working with dataframes and have been struggling with this thing, any
pointers would be helpful.
I have a JSON file with a schema like this:

 |-- links: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- desc: string (nullable = true)
 |    |    |--
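A sketch of how a nested schema like this is typically read: explode the array, then access struct fields with dot notation. The DataFrame calls need a running SparkContext, so they are shown as comments; the Link data below is hypothetical:

```scala
// DataFrame sketch for a schema like the one above:
//   df.select(explode(col("links")).as("link"))
//     .select(col("link.desc"))
// The same shape in plain Scala: an array of structs, projecting one field.
case class Link(desc: String)
val links = Seq(Link("getting-started"), Link("faq"))  // hypothetical data
val descs = links.map(_.desc)
println(descs.mkString(","))  // getting-started,faq
```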
Hi Marcelo, Quick Question.
I am using Spark 1.3 with YARN client mode. It is working well,
except that I have to manually pip-install all the 3rd party libraries like
numpy etc. on the executor nodes.
So does the SPARK-5479 fix in 1.5 which you mentioned address this as well?
Thanks.
On Thu, Jun
Cloudera blog has some details.
Please check if this is helpful to you.
http://blog.cloudera.com/blog/2014/12/new-in-cloudera-labs-sparkonhbase/
Thanks.
On Wed, May 20, 2015 at 4:21 AM, donhoff_h 165612...@qq.com wrote:
Hi, all
I wrote a program to get HBaseConfiguration object in Spark.
Hi All,
I am trying to run a sample Spark program using Scala and SBT.
Below is the program:
def main(args: Array[String]) {
  val logFile = "E:/ApacheSpark/usb/usb/spark/bin/README.md" // Should be some file on your system
  val sc = new SparkContext("local", "Simple App",
  ...
Lines with a: 24, Lines with b: 15
The exception seems to be happening with Spark cleanup after executing
your code. Try adding sc.stop() at the end of your program to see if the
exception goes away.
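The quick-start counting logic with that advice applied can be sketched as follows; the input lines are hypothetical stand-ins for README.md, and the Spark-specific calls are shown as comments:

```scala
// The quick-start counting logic with sc.stop() applied; the input lines
// are hypothetical stand-ins for README.md.
val logData = Seq("a line about spark", "basics", "more a lines")
// In the real app: val logData = sc.textFile(logFile, 2).cache()
val numAs = logData.count(_.contains("a"))
val numBs = logData.count(_.contains("b"))
println(s"Lines with a: $numAs, Lines with b: $numBs")
// sc.stop()  // stop the SparkContext so shutdown cleanup does not throw
```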
On Wednesday, December 31, 2014 6:40 AM, Naveen Madhire
vmadh...@umail.iu.edu wrote: