Re: NullPointerException in SparkSession while reading Parquet files on S3

2021-05-25 Thread YEONWOO BAEK
unsubscribe On Wed, May 26, 2021 at 12:31 AM, Eric Beabes wrote: > I keep getting the following exception when I am trying to read a Parquet > file from a Path on S3 in Spark/Scala. Note: I am running this on EMR. > > java.lang.NullPointerException > at >

Re: Nullpointerexception error when in repartition

2018-04-12 Thread Junfeng Chen
Hi, I know it, but my purpose is to transform JSON strings in a Dataset to a typed Dataset, while spark.readStream can only read JSON files from a specified path. https://stackoverflow.com/questions/48617474/how-to-convert-json-dataset-to-dataframe-in-spark-structured-streaming gives an essential
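The transformation Junfeng describes, parsing JSON strings already held in a Dataset rather than reading files from a path, is usually done with from_json. A sketch (requires a SparkSession; the two-field schema and the "value" column name are assumptions, not from the original thread):

```scala
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{IntegerType, StringType, StructType}

// jsonStrings: a Dataset[String] of raw JSON, e.g. the "value" column of a Kafka source.
// Declare the expected structure up front instead of letting spark.read infer it.
val schema = new StructType().add("name", StringType).add("age", IntegerType)

val parsed = jsonStrings
  .select(from_json(col("value"), schema).as("data")) // parse each JSON string
  .select("data.*")                                   // flatten struct to top-level columns
```

This works on both static and streaming Datasets, which sidesteps the "Queries with streaming sources must be executed with writeStream.start()" error entirely.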

Re: Nullpointerexception error when in repartition

2018-04-12 Thread Tathagata Das
Have you read through the documentation of Structured Streaming? https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html One of the basic mistakes you are making is defining the dataset with `spark.read()`. You define a streaming Dataset with `spark.readStream()`. On
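The distinction TD is pointing at can be sketched as follows (paths, schema, and formats are illustrative; this assumes a running SparkSession and is not the original poster's code):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.StructType

val spark = SparkSession.builder.appName("demo").getOrCreate()
val schema = new StructType().add("msg", "string")

// Batch: a static Dataset, read once from the current contents of the path.
val staticDf = spark.read.schema(schema).json("/data/in")

// Streaming: a streaming Dataset; any query over it must be started
// with writeStream...start(), never collected or written directly.
val streamDf = spark.readStream.schema(schema).json("/data/in")
streamDf.writeStream
  .format("parquet")
  .option("path", "/data/out")
  .option("checkpointLocation", "/data/chk")
  .start()
```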

Re: Nullpointerexception error when in repartition

2018-04-12 Thread Junfeng Chen
Hi, Tathagata. I have tried Structured Streaming, but the line > Dataset rowDataset = spark.read().json(jsondataset); always throws > Queries with streaming sources must be executed with writeStream.start(). But what I need to do in this step is only transform JSON string data to a Dataset.

Re: Nullpointerexception error when in repartition

2018-04-12 Thread Tathagata Das
It's not very surprising that doing this sort of RDD to DF conversion inside DStream.foreachRDD has weird corner cases like this. In fact, you are going to have additional problems with partial parquet files (when there are failures) in this approach. I strongly suggest that you use Structured

Re: NullPointerException while reading a column from the row

2017-12-19 Thread Vadim Semenov
getAs is defined as: def getAs[T](i: Int): T = get(i).asInstanceOf[T] and when you do toString you call Object.toString, which doesn't depend on the type, so the asInstanceOf[T] gets dropped by the compiler, i.e. row.getAs[Int](0).toString -> row.get(0).toString. We can confirm that by writing a simple
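Vadim's point can be reproduced without Spark at all. A minimal sketch, with getAs mimicking Row.getAs (the array-backed "row" is illustrative):

```scala
object ErasureDemo {
  // Mimics Row.getAs: T is erased, so the cast happens only when the value is used.
  def getAs[T](values: Array[Any], i: Int): T = values(i).asInstanceOf[T]

  def main(args: Array[String]): Unit = {
    val row: Array[Any] = Array(null)

    // A direct cast of null to Int unboxes to the default value 0...
    println(row(0).asInstanceOf[Int]) // prints 0

    // ...but through the erased generic, toString is invoked on the raw
    // (null) reference, so getAs[Int](row, 0).toString behaves like
    // row(0).toString and throws NullPointerException.
    try {
      getAs[Int](row, 0).toString
      println("no exception")
    } catch {
      case _: NullPointerException => println("NPE")
    }
  }
}
```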

Re: NullPointerException error while saving Scala Dataframe to HBase

2017-10-01 Thread Marco Mistroni
Hi. The question is getting to the list. I have no experience in HBase, though; having seen similar issues when saving a DataFrame somewhere else, it might have to do with the properties you need to set to let Spark know it is dealing with HBase. Don't you need to set some properties on the Spark

Re: NullPointerException error while saving Scala Dataframe to HBase

2017-09-30 Thread mailfordebu
Hi guys, I am not sure whether the email is reaching the community members. Please can somebody acknowledge. Sent from my iPhone > On 30-Sep-2017, at 5:02 PM, Debabrata Ghosh wrote: > > Dear All, > Greetings! I am repeatedly hitting a

Re: NullPointerException when starting StreamingContext

2016-06-24 Thread Sunita Arvind
I was able to resolve the serialization issue. The root cause was that I was accessing the config values within foreachRDD{}. The solution was to extract the values from the config outside the foreachRDD scope and pass those values into the loop directly. Probably something obvious, as we cannot have nested
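The fix Sunita describes looks roughly like this (a sketch: the Typesafe Config library, the key names, and `stream` are all assumptions; the thread does not say which config library was used):

```scala
import com.typesafe.config.{Config, ConfigFactory}

val config: Config = ConfigFactory.load()

// Anti-pattern: referencing `config` inside foreachRDD drags the whole
// (possibly non-serializable) object into the task closure.
// Fix: extract plain serializable values on the driver, OUTSIDE the closure.
val outputPath: String = config.getString("output.path")

stream.foreachRDD { rdd =>
  // Only the extracted String is captured here, not `config` itself.
  rdd.saveAsTextFile(s"$outputPath/${System.currentTimeMillis}")
}
```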

Re: NullPointerException when starting StreamingContext

2016-06-24 Thread Cody Koeninger
That looks like a classpath problem. You should not have to include the kafka_2.10 artifact in your pom, spark-streaming-kafka_2.10 already has a transitive dependency on it. That being said, 0.8.2.1 is the correct version, so that's a little strange. How are you building and submitting your

Re: NullPointerException when starting StreamingContext

2016-06-23 Thread Sunita Arvind
Also, just to keep it simple, I am trying to use 1.6.0-cdh5.7.0 in the pom.xml, as the cluster I am trying to run on is CDH 5.7.0 with Spark 1.6.0. Here is my pom setting:

<cdh.spark.version>1.6.0-cdh5.7.0</cdh.spark.version>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>${cdh.spark.version}</version>
  <scope>compile</scope>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>

Re: NullPointerException when starting StreamingContext

2016-06-22 Thread Ted Yu
Which Scala version / Spark release are you using ? Cheers On Wed, Jun 22, 2016 at 8:20 PM, Sunita Arvind wrote: > Hello Experts, > > I am getting this error repeatedly: > > 16/06/23 03:06:59 ERROR streaming.StreamingContext: Error starting the > context, marking it as

Re: NullPointerException

2016-03-12 Thread saurabh guru
I don't see how that would be possible. I am reading from a live stream of data through Kafka. On Sat, 12 Mar 2016 20:28, Ted Yu wrote: > Interesting. > If kv._1 was null, shouldn't the NPE have come from getPartition() (line > 105)? > > Was it possible that records.next()

Re: NullPointerException

2016-03-12 Thread Ted Yu
Interesting. If kv._1 was null, shouldn't the NPE have come from getPartition() (line 105) ? Was it possible that records.next() returned null ? On Fri, Mar 11, 2016 at 11:20 PM, Prabhu Joseph wrote: > Looking at ExternalSorter.scala line 192, i suspect some input

Re: NullPointerException

2016-03-11 Thread Saurabh Guru
I am using the following versions:

<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming_2.10</artifactId>
  <version>1.6.0</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming-kafka_2.10</artifactId>

Re: NullPointerException

2016-03-11 Thread Ted Yu
Which Spark release do you use ? I wonder if the following may have fixed the problem: SPARK-8029 Robust shuffle writer JIRA is down, cannot check now. On Fri, Mar 11, 2016 at 11:01 PM, Saurabh Guru wrote: > I am seeing the following exception in my Spark Cluster every

Re: NullPointerException

2016-03-11 Thread Prabhu Joseph
Looking at ExternalSorter.scala line 192, I suspect some input record has a null key.

189  while (records.hasNext) {
190    addElementsRead()
191    kv = records.next()
192    map.changeValue((getPartition(kv._1), kv._1), update)

On Sat, Mar 12, 2016 at 12:48 PM, Prabhu Joseph
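If a null key is indeed the culprit, guarding the pair RDD before the shuffle avoids the NPE. The same guard, shown on a plain Scala collection so it runs standalone (in Spark the equivalent is rdd.filter(_._1 != null) before the key-based stage):

```scala
object NullKeyGuard {
  def main(args: Array[String]): Unit = {
    val pairs: Seq[(String, Int)] = Seq(("a", 1), (null, 2), ("b", 3))
    // Drop pairs whose key is null before any key-based operation.
    val safe = pairs.filter { case (k, _) => k != null }
    println(safe.mkString(",")) // prints (a,1),(b,3)
  }
}
```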

Re: NullPointerException

2016-03-11 Thread Prabhu Joseph
Looking at ExternalSorter.scala line 192:

189  while (records.hasNext) {
       addElementsRead()
       kv = records.next()
       map.changeValue((getPartition(kv._1), kv._1), update)
       maybeSpillCollection(usingMap = true)
     }

On Sat, Mar 12, 2016 at 12:31 PM, Saurabh Guru wrote: > I am seeing

Re: NullPointerException with joda time

2015-11-12 Thread Romain Sagean
I still can't make the logger work inside a map function. I can use logInfo("") in the main but not in the function. Anyway, I rewrote my program to use java.util.Date instead of Joda-Time and I don't get the NPE anymore. I will stick with this solution for the moment, even if I find java.util.Date ugly.

Re: NullPointerException with joda time

2015-11-12 Thread Ted Yu
Even if log4j didn't work, you can still get some clue by wrapping the following call with a try block: currentDate = currentDate.plusDays(1), catching the NPE and rethrowing it with an exception that shows the value of currentDate. Cheers On Thu, Nov 12, 2015 at 1:56 AM, Romain Sagean
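Ted's suggestion, sketched (currentDate is the Joda-Time value from the original code; the wrapper exception and its message are illustrative, and this needs joda-time on the classpath):

```scala
import org.joda.time.DateTime

def nextDay(currentDate: DateTime): DateTime =
  try {
    currentDate.plusDays(1)
  } catch {
    case e: NullPointerException =>
      // Rethrow with the offending value so the failing record is visible
      // in the executor logs, even when log4j inside the task is silent.
      throw new IllegalStateException(s"plusDays(1) failed for currentDate=$currentDate", e)
  }
```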

Re: NullPointerException with joda time

2015-11-12 Thread Koert Kuipers
I remember us having issues with Joda classes not serializing properly and coming out null "on the other side" in tasks. On Thu, Nov 12, 2015 at 10:12 AM, Ted Yu wrote: > Even if log4j didn't work, you can still get some clue by wrapping the > following call with a try block:

Re: NullPointerException with joda time

2015-11-11 Thread Ted Yu
In case you need to adjust log4j properties, see the following thread: http://search-hadoop.com/m/q3RTtJHkzb1t0J66=Re+Spark+Streaming+Log4j+Inside+Eclipse Cheers On Tue, Nov 10, 2015 at 1:28 PM, Ted Yu wrote: > I took a look at >

Re: NullPointerException with joda time

2015-11-10 Thread Ted Yu
Can you show the stack trace for the NPE ? Which release of Spark are you using ? Cheers On Tue, Nov 10, 2015 at 8:20 AM, romain sagean wrote: > Hi community, > I try to apply the function below during a flatMapValues or a map but I > get a nullPointerException with the

Re: NullPointerException with joda time

2015-11-10 Thread Romain Sagean
See below a more complete version of the code. The firstDate (previously minDate) should not be null; I even added an extra filter(_._2 != null) before the flatMap and the error is still there. What I don't understand is why I have the error on dateSeq.last.plusDays and not on

Re: NullPointerException with joda time

2015-11-10 Thread Ted Yu
I took a look at https://github.com/JodaOrg/joda-time/blob/master/src/main/java/org/joda/time/DateTime.java Looks like the NPE came from line below: long instant = getChronology().days().add(getMillis(), days); Maybe catch the NPE and print out the value of currentDate to see if there is more

Re: NullPointerException when cache DataFrame in Java (Spark1.5.1)

2015-10-29 Thread Romi Kuntsman
Did you try to cache a DataFrame with just a single row? Do your rows have any columns with null values? Can you post a code snippet here on how you load/generate the dataframe? Does dataframe.rdd.cache work? *Romi Kuntsman*, *Big Data Engineer* http://www.totango.com On Thu, Oct 29, 2015 at 4:33

Re: NullPointerException when cache DataFrame in Java (Spark1.5.1)

2015-10-29 Thread Zhang, Jingyu
Thanks Romi. I resized the dataset to 7 MB; however, the code throws the NullPointerException as well. Did you try to cache a DataFrame with just a single row? Yes, I tried, but same problem. Do your rows have any columns with null values? No, I had filtered out null values before caching the

Re: NullPointerException when cache DataFrame in Java (Spark1.5.1)

2015-10-29 Thread Romi Kuntsman
> > BUT, after changing limit(500) to limit(1000), the code reports a > NullPointerException. > I had a similar situation, and the problem was with a certain record. Try to find which records are returned when you limit to 1000 but not when you limit to 500. Could it be an NPE thrown from
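One hedged way to isolate the suspect records Romi describes (a sketch; the sort column "id" is an assumption, needed because limit() without an ordering is not deterministic across runs):

```scala
// Pin down which rows "the first 500" and "the first 1000" actually are.
val first500  = df.sort("id").limit(500)
val first1000 = df.sort("id").limit(1000)

// Rows present in the first 1000 but absent from the first 500: the suspects
// whose contents trigger the NPE on cache.
val suspects = first1000.except(first500)
suspects.show()
```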

Re: NullPointerException inside RDD when calling sc.textFile

2015-07-23 Thread Akhil Das
Did you try:

val data = indexed_files.groupByKey
val modified_data = data.map { a =>
  var name = a._2.mkString(",")
  (a._1, name)
}
modified_data.foreach { a =>
  var file = sc.textFile(a._2)
  println(file.count)
}

Thanks, Best Regards On Wed, Jul 22, 2015 at 2:18 AM, MorEru

Re: NullPointerException with functions.rand()

2015-06-12 Thread Ted Yu
Created PR and verified the example given by Justin works with the change: https://github.com/apache/spark/pull/6793 Cheers On Wed, Jun 10, 2015 at 7:15 PM, Ted Yu yuzhih...@gmail.com wrote: Looks like the NPE came from this line: @transient protected lazy val rng = new XORShiftRandom(seed

Re: NullPointerException SQLConf.setConf

2015-06-07 Thread Cheng Lian
Are you calling hiveContext.sql within an RDD.map closure or something similar? In this way, the call actually happens on executor side. However, HiveContext only exists on the driver side. Cheng On 6/4/15 3:45 PM, patcharee wrote: Hi, I am using Hive 0.14 and spark 0.13. I got

Re: NullPointerException in ApplicationMaster

2015-02-25 Thread Zhan Zhang
Look at the trace again. It is a very weird error. The SparkSubmit is running on the client side, but YarnClusterSchedulerBackend is supposed to be running in the YARN AM. I suspect you are running the cluster in yarn-client mode, but in JavaSparkContext you set "yarn-cluster". As a result, spark

Re: NullPointerException

2014-12-31 Thread Josh Rosen
Which version of Spark are you using? On Wed, Dec 31, 2014 at 10:24 PM, rapelly kartheek kartheek.m...@gmail.com wrote: Hi, I get this following Exception when I submit spark application that calculates the frequency of characters in a file. Especially, when I increase the size of data, I

Re: NullPointerException

2014-12-31 Thread rapelly kartheek
spark-1.0.0 On Thu, Jan 1, 2015 at 12:04 PM, Josh Rosen rosenvi...@gmail.com wrote: Which version of Spark are you using? On Wed, Dec 31, 2014 at 10:24 PM, rapelly kartheek kartheek.m...@gmail.com wrote: Hi, I get this following Exception when I submit spark application that calculates

Re: NullPointerException

2014-12-31 Thread Josh Rosen
It looks like 'null' might be selected as a block replication peer? https://github.com/apache/spark/blob/v1.0.0/core/src/main/scala/org/apache/spark/storage/BlockManager.scala#L786 I know that we fixed some replication bugs in newer versions of Spark (such as

Re: NullPointerException

2014-12-31 Thread rapelly kartheek
Ok. Let me try out on a newer version. Thank you!! On Thu, Jan 1, 2015 at 12:17 PM, Josh Rosen rosenvi...@gmail.com wrote: It looks like 'null' might be selected as a block replication peer?

Re: NullPointerException on cluster mode when using foreachPartition

2014-12-16 Thread Shixiong Zhu
Could you post the stack trace? Best Regards, Shixiong Zhu 2014-12-16 23:21 GMT+08:00 richiesgr richie...@gmail.com: Hi, this time I need an expert. On 1.1.1, and only in cluster mode (standalone or EC2), when I use this code: countersPublishers.foreachRDD(rdd => {

Re: NullPointerException When Reading Avro Sequence Files

2014-12-15 Thread Simone Franzini
, Cristóvão José Domingues Cordeiro IT Department - 28/R-018 CERN -- *From:* Simone Franzini [captainfr...@gmail.com] *Sent:* 15 December 2014 16:52 *To:* Cristovao Jose Domingues Cordeiro *Subject:* Re: NullPointerException When Reading Avro Sequence Files Ok, I

Re: NullPointerException When Reading Avro Sequence Files

2014-12-09 Thread Simone Franzini
-- *From:* Simone Franzini [captainfr...@gmail.com] *Sent:* 06 December 2014 15:48 *To:* Cristovao Jose Domingues Cordeiro *Subject:* Re: NullPointerException When Reading Avro Sequence Files java.lang.IncompatibleClassChangeError: Found interface

Re: NullPointerException When Reading Avro Sequence Files

2014-12-09 Thread Simone Franzini
*Subject:* Re: NullPointerException When Reading Avro Sequence Files Hi Cristovao, I have seen a very similar issue that I have posted about in this thread: http://apache-spark-user-list.1001560.n3.nabble.com/Kryo-NPE-with-Array-td19797.html I think your main issue here is somewhat

Re: NullPointerException When Reading Avro Sequence Files

2014-12-05 Thread cjdc
Hi all, I've tried the above example on Gist, but it doesn't work (at least for me). Did anyone get this: 14/12/05 10:44:40 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0) java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class

Re: NullPointerException on reading checkpoint files

2014-09-23 Thread RodrigoB
Hi TD, This is actually an important requirement (recovery of shared variables) for us as we need to spread some referential data across the Spark nodes on application startup. I just bumped into this issue on Spark version 1.0.1. I assume the latest one also doesn't include this capability. Are

Re: NullPointerException on reading checkpoint files

2014-09-23 Thread Tathagata Das
This is actually very tricky, as there are two pretty big challenges that need to be solved. (i) Checkpointing for broadcast variables: unlike RDDs, broadcast variables don't have checkpointing support (that is, you cannot write the content of a broadcast variable to HDFS and recover it automatically

Re: NullPointerException from '.count.foreachRDD'

2014-08-20 Thread anoldbrain
Looking at the source code of DStream.scala:

/**
 * Return a new DStream in which each RDD has a single element generated by counting each RDD
 * of this DStream.
 */
def count(): DStream[Long] = {
  this.map(_ => (null, 1L))

RE: NullPointerException from '.count.foreachRDD'

2014-08-20 Thread Shao, Saisai
Message- From: anoldbrain [mailto:anoldbr...@gmail.com] Sent: Wednesday, August 20, 2014 4:13 PM To: u...@spark.incubator.apache.org Subject: Re: NullPointerException from '.count.foreachRDD' Looking at the source codes of DStream.scala /** * Return a new DStream in which each RDD has

RE: NullPointerException from '.count.foreachRDD'

2014-08-20 Thread anoldbrain
Thank you for the reply. I implemented my InputDStream to return None when there's no data. After changing it to return an empty RDD, the exception is gone. I am curious as to why all other processing worked correctly with my old incorrect implementation, with or without data. My actual codes,
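The change anoldbrain describes can be sketched like this (MyRecord and fetchBatch are illustrative stand-ins; the original code is not shown in the thread):

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{StreamingContext, Time}
import org.apache.spark.streaming.dstream.InputDStream

case class MyRecord(value: String) // illustrative record type

class MyInputDStream(ssc: StreamingContext)(fetchBatch: Time => Seq[MyRecord])
    extends InputDStream[MyRecord](ssc) {

  override def start(): Unit = ()
  override def stop(): Unit = ()

  override def compute(validTime: Time): Option[RDD[MyRecord]] = {
    val batch = fetchBatch(validTime)
    // Returning None for an empty batch trips up downstream operators such
    // as count() (which maps every RDD to (null, 1L) pairs); return an
    // empty RDD per batch interval instead.
    if (batch.isEmpty) Some(context.sparkContext.emptyRDD[MyRecord])
    else Some(context.sparkContext.parallelize(batch))
  }
}
```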

Re: NullPointerException when connecting from Spark to a Hive table backed by HBase

2014-08-19 Thread Yin Huai
Seems https://issues.apache.org/jira/browse/SPARK-2846 is the JIRA tracking this issue. On Mon, Aug 18, 2014 at 6:26 PM, cesararevalo ce...@zephyrhealthinc.com wrote: Thanks, Zhan for the follow up. But, do you know how I am supposed to set that table name on the jobConf? I don't have

Re: NullPointerException when connecting from Spark to a Hive table backed by HBase

2014-08-19 Thread Cesar Arevalo
Thanks! Yeah, it may be related to that. I'll check out that pull request that was sent and hopefully that fixes the issue. I'll let you know, after fighting with this issue yesterday I had decided to just leave it on the side and return to it after, so it may take me a while to get back to you.

Re: NullPointerException when connecting from Spark to a Hive table backed by HBase

2014-08-18 Thread Akhil Das
Looks like your hiveContext is null. Have a look at this documentation. https://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables Thanks Best Regards On Mon, Aug 18, 2014 at 12:09 PM, Cesar Arevalo ce...@zephyrhealthinc.com wrote: Hello: I am trying to setup Spark to

Re: NullPointerException when connecting from Spark to a Hive table backed by HBase

2014-08-18 Thread Cesar Arevalo
Nope, it is NOT null. Check this out:

scala> hiveContext == null
res2: Boolean = false

And thanks for sending that link, but I had already looked at it. Any other ideas? I looked through some of the relevant Spark Hive code and I'm starting to think this may be a bug. -Cesar On Mon, Aug 18,

Re: NullPointerException when connecting from Spark to a Hive table backed by HBase

2014-08-18 Thread Akhil Das
Then definitely its a jar conflict. Can you try removing this jar from the class path /opt/spark-poc/lib_managed/jars/org.spark-project.hive/hive-exec/ hive-exec-0.12.0.jar Thanks Best Regards On Mon, Aug 18, 2014 at 12:45 PM, Cesar Arevalo ce...@zephyrhealthinc.com wrote: Nope, it is NOT

Re: NullPointerException when connecting from Spark to a Hive table backed by HBase

2014-08-18 Thread Cesar Arevalo
I removed the JAR that you suggested, but now I get another error when I try to create the HiveContext. Here is the error:

scala> val hiveContext = new HiveContext(sc)
error: bad symbolic reference. A signature in HiveContext.class refers to term ql in package org.apache.hadoop.hive which is not

Re: NullPointerException when connecting from Spark to a Hive table backed by HBase

2014-08-18 Thread Zhan Zhang
Looks like hbaseTableName is null, probably caused by incorrect configuration.

String hbaseTableName = jobConf.get(HBaseSerDe.HBASE_TABLE_NAME);
setHTable(new HTable(HBaseConfiguration.create(jobConf), Bytes.toBytes(hbaseTableName)));

Here is the definition. public static final

Re: NullPointerException when connecting from Spark to a Hive table backed by HBase

2014-08-18 Thread cesararevalo
Thanks, Zhan for the follow up. But, do you know how I am supposed to set that table name on the jobConf? I don't have access to that object from my client driver? -- View this message in context:

Re: NullPointerException When Reading Avro Sequence Files

2014-07-21 Thread Sparky
For those curious, I used the JavaSparkContext and got access to an AvroSequenceFile (a wrapper around SequenceFile) using the following:

file = sc.newAPIHadoopFile("hdfs path to my file", AvroSequenceFileInputFormat.class, AvroKey.class, AvroValue.class, new Configuration())

-- View this

Re: NullPointerException When Reading Avro Sequence Files

2014-07-19 Thread Sparky
I see Spark is using AvroRecordReaderBase, which is used to grab Avro Container Files, which is different from Sequence Files. If anyone is using Avro Sequence Files with success and has an example, please let me know. -- View this message in context:

Re: NullPointerException When Reading Avro Sequence Files

2014-07-19 Thread Sparky
To be more specific, I'm working with a system that stores data in org.apache.avro.hadoop.io.AvroSequenceFile format. An AvroSequenceFile is "A wrapper around a Hadoop SequenceFile that also supports reading and writing Avro data." It seems that Spark does not support this out of the box. --

Re: NullPointerException When Reading Avro Sequence Files

2014-07-19 Thread Nick Pentreath
I got this working locally a little while ago when playing around with AvroKeyInputFile: https://gist.github.com/MLnick/5864741781b9340cb211 But not sure about AvroSequenceFile. Any chance you have an example datafile or records? On Sat, Jul 19, 2014 at 11:00 AM, Sparky gullo_tho...@bah.com

Re: NullPointerException When Reading Avro Sequence Files

2014-07-19 Thread Sparky
Thanks for the gist. I'm just now learning about Avro. I think when you use a DataFileWriter you are writing to an Avro Container (which is different than an Avro Sequence File). I have a system where data was written to an HDFS Sequence File using AvroSequenceFile.Writer (which is a wrapper

Re: NullPointerException When Reading Avro Sequence Files

2014-07-18 Thread aaronjosephs
I think you probably want to use `AvroSequenceFileOutputFormat` with `newAPIHadoopFile`. I'm not even sure that in Hadoop you would use SequenceFileInputFormat to read an Avro sequence file. -- View this message in context:

Re: NullPointerException When Reading Avro Sequence Files

2014-07-18 Thread Sparky
Thanks for responding. I tried using the newAPIHadoopFile method and got an IOException with the message "Not a data file." If anyone has an example of this working, I'd appreciate your input or examples. What I entered at the REPL and what I got back are below: val myAvroSequenceFile =

Re: NullPointerException on ExternalAppendOnlyMap

2014-07-02 Thread Andrew Or
Hi Konstantin, thanks for reporting this. This happens because there are null keys in your data. In general, Spark should not throw null pointer exceptions, so this is a bug. I have fixed it here: https://github.com/apache/spark/pull/1288. For now, you can work around it by special-handling
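Andrew's workaround, special-handling null keys before they reach the combiner, can look like this. Shown on a plain Scala collection so it runs standalone (the sentinel name is an invention for illustration; in Spark you would apply the same map to the pair RDD before the aggregation):

```scala
object NullKeyWorkaround {
  def main(args: Array[String]): Unit = {
    val pairs: Seq[(String, Int)] = Seq(("a", 1), (null, 2), ("a", 3))
    // Replace null keys with a sentinel so key-based machinery never sees null.
    val keyed = pairs.map { case (k, v) => (Option(k).getOrElse("__NULL__"), v) }
    val summed = keyed.groupBy(_._1).map { case (k, vs) => (k, vs.map(_._2).sum) }
    println(summed.toSeq.sortBy(_._1).mkString(",")) // prints (__NULL__,2),(a,4)
  }
}
```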

Re: NullPointerException on reading checkpoint files

2014-06-12 Thread Kiran
I am also seeing similar problem when trying to continue job using saved checkpoint. Can somebody help in solving this problem? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/NullPointerException-on-reading-checkpoint-files-tp7306p7507.html Sent from the