Untraceable NullPointerException

2022-12-09 Thread Alberto Huélamo
Hello, I have a job that runs on Databricks Runtime 11.3 LTS, so Spark 3.3.0, which is causing a NullPointerException, and the stack trace does not include any reference to the job's code, only Spark internals. This makes me wonder whether we're dealing with a Spark bug or there

Re: NullPointerException in SparkSession while reading Parquet files on S3

2021-05-25 Thread YEONWOO BAEK
unsubscribe On Wed, May 26, 2021 at 12:31 AM, Eric Beabes wrote: > I keep getting the following exception when I am trying to read a Parquet > file from a Path on S3 in Spark/Scala. Note: I am running this on EMR. > > java.lang.NullPointerException > at > org.apache.spark.sql.SparkSession.sessi

NullPointerException in SparkSession while reading Parquet files on S3

2021-05-25 Thread Eric Beabes
I keep getting the following exception when I am trying to read a Parquet file from a Path on S3 in Spark/Scala. Note: I am running this on EMR. java.lang.NullPointerException at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:144) at org.apache.spark

Re: [Structured Streaming] NullPointerException in long running query

2020-04-29 Thread ZHANG Wei
Is there any chance we also print the least recent failure in the stage, in addition to the following most recent failure, before the driver stack trace? > >> Caused by: org.apache.spark.SparkException: Job aborted due to stage > >> failure: Task 10 in stage 1.0 failed 4 times, most recent failure: Lost > >> task 10

Re: [Structured Streaming] NullPointerException in long running query

2020-04-28 Thread Shixiong(Ryan) Zhu
The stack trace is omitted by JVM when an exception is thrown too many times. This usually happens when you have multiple Spark tasks on the same executor JVM throwing the same exception. See https://stackoverflow.com/a/3010106 Best Regards, Ryan On Tue, Apr 28, 2020 at 10:45 PM lec ssmi wrote:
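For anyone hitting this omitted-stack-trace behaviour: HotSpot drops stack traces for frequently thrown built-in exceptions unless fast-throw is disabled. A minimal sketch of the relevant settings (the JVM flag and Spark configuration keys are standard; whether they help depends on the workload):

    import org.apache.spark.SparkConf

    // Keep full stack traces for repeatedly thrown built-in exceptions
    // (NullPointerException, ArrayIndexOutOfBoundsException, ...) by
    // disabling HotSpot's fast-throw optimization on driver and executors.
    val conf = new SparkConf()
      .set("spark.executor.extraJavaOptions", "-XX:-OmitStackTraceInFastThrow")
      .set("spark.driver.extraJavaOptions", "-XX:-OmitStackTraceInFastThrow")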

Re: [Structured Streaming] NullPointerException in long running query

2020-04-28 Thread lec ssmi
It seems to be a problem with my data quality. It's curious that the driver-side exception stack has no specific exception information. Edgardo Szrajber wrote on Tue, Apr 28, 2020 at 3:32 PM: > The exception occurred while aborting the stage. It might be interesting to > try to understand the reason for the abortion

Re: [Structured Streaming] NullPointerException in long running query

2020-04-28 Thread Edgardo Szrajber
The exception occurred while aborting the stage. It might be interesting to try to understand the reason for the abortion. Maybe a timeout? How long did the query run? Bentzi On Tue, Apr 28, 2020 at 9:25, Jungtaek Lim wrote: The root cause of the exception occurred

Re: [Structured Streaming] NullPointerException in long running query

2020-04-27 Thread Jungtaek Lim
The root cause of the exception is on the executor side ("Lost task 10.3 in stage 1.0 (TID 81, spark6, executor 1)"), so you may need to check there. On Tue, Apr 28, 2020 at 2:52 PM lec ssmi wrote: > Hi: > One of my long-running queries occasionally encountered the following > exception: > > >

[Structured Streaming] NullPointerException in long running query

2020-04-27 Thread lec ssmi
Hi: One of my long-running queries occasionally encountered the following exception: Caused by: org.apache.spark.SparkException: Job aborted due to stage > failure: Task 10 in stage 1.0 failed 4 times, most recent failure: Lost > task 10.3 in stage 1.0 (TID 81, spark6, executor 1): > java.lan

NullPointerException at FileBasedWriteAheadLogRandomReader

2019-12-27 Thread Kang Minwoo
Hello, users. While using write-ahead logs in Spark Streaming, I got an error that is a NullPointerException at FileBasedWriteAheadLogRandomReader.scala:48 [1] [1]: https://github.com/apache/spark/blob/v2.4.4/streaming/src/main/scala/org/apache/spark/streaming/util

NullPointerException when scanning HBase table

2018-04-30 Thread Huiliang Zhang
Hi, In my spark job, I need to scan an HBase table. I set up a scan with custom filters, then use the newAPIHadoopRDD function to get a JavaPairRDD variable X. The problem is that when no records inside HBase match my filters, the call X.isEmpty() or X.count() will cause a java.lang.NullPointerException

Re: Nullpointerexception error when in repartition

2018-04-12 Thread Junfeng Chen
// convert json to df JavaRDD rowJavaRDD = df.javaRDD().map... // add some new fields StructType type = df.schema()...; // construct new type for new added fields Dat

Re: Nullpointerexception error when in repartition

2018-04-12 Thread Tathagata Das
// construct new type for new added fields Dataset ... // create new dataframe newdf.repartition(taskNum).write().mode(SaveMode.Append).partitionBy("appname").parquet(savepath); // save to parquet

Re: Nullpointerexception error when in repartition

2018-04-12 Thread Junfeng Chen
type = df.schema()...; // construct new type for new added fields Dataset ... // create new dataframe newdf.repartition(taskNum).write().mode(SaveMode.Append).partitionBy("appname").parquet(savepath);

Re: Nullpointerexception error when in repartition

2018-04-12 Thread Tathagata Das
...partitionBy("appname").parquet(savepath); // save to parquet }) However, if I remove the repartition method of newdf in the parquet-writing stage, the program always throws a NullPointerException in the json convert line: java.lang.NullPointerExceptio

Nullpointerexception error when in repartition

2018-04-11 Thread Junfeng Chen
(SaveMode.Append).partitionBy("appname").parquet(savepath); // save to parquet }) However, if I remove the repartition method of newdf in the parquet-writing stage, the program always throws a NullPointerException in the json convert line: java.lang.NullPointerExcepti
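The method names in the snippets above are misspellings of the real DataFrame writer API (repartition, partitionBy). A minimal self-contained sketch of the intended pipeline, with hypothetical paths and column names:

    import org.apache.spark.sql.{SaveMode, SparkSession}
    import org.apache.spark.sql.functions.lit

    val spark = SparkSession.builder().appName("json-to-parquet").getOrCreate()

    val df = spark.read.json("/path/to/input")          // convert json to df
    val newdf = df.withColumn("appname", lit("myapp"))  // add a new field

    newdf.repartition(8)                                // note: repartition
      .write
      .mode(SaveMode.Append)
      .partitionBy("appname")                           // note: partitionBy
      .parquet("/path/to/output")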

NullPointerException issue in LDA.train()

2018-02-09 Thread Kevin Lam
Hi, We're encountering an issue with training an LDA model in PySpark. The issue is as follows: - Running LDA on some large set of documents (12M, ~2-5kB each) - Works fine for small subset of full set (100K - 1M) - Hit a NullPointerException for full data set - Running workload on google

Re: [Spark DataFrame]: Passing DataFrame to custom method results in NullPointerException

2018-01-22 Thread Matteo Cossu
...Unfortunately the above map only returns a TraversableLike collection, so I can't do transformations and joins on this data structure, so I tried to apply a filter on the rdd with the following code: .filter(line => validate_hostname(line, data_frame)).count()

[Spark DataFrame]: Passing DataFrame to custom method results in NullPointerException

2018-01-15 Thread abdul.h.hussain
...validate_hostname(line, data_frame)).count() Unfortunately the above method with filtering the rdd does not pass the data_frame, so I get a NullPointerException, though it correctly passes the case class, which I print within the method. Where am I going wrong? Regards, Abdul Haseeb Hussain
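A DataFrame only exists on the driver, so referencing data_frame inside an RDD filter closure yields an NPE on the executors. A sketch of one common workaround, assuming the rdd, data_frame, and a hostname column from the post (the extractHost helper is hypothetical): collect the lookup values, broadcast them, and filter against the broadcast.

    // Collect the lookup column on the driver and broadcast it; closures
    // may then consult the broadcast value instead of the DataFrame itself.
    val validHosts = data_frame.select("hostname").collect().map(_.getString(0)).toSet
    val validHostsBC = sc.broadcast(validHosts)

    val matched = rdd.filter(line => validHostsBC.value.contains(extractHost(line)))
    println(matched.count())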

Re: NullPointerException while reading a column from the row

2017-12-19 Thread Vadim Semenov
...Test.this.row().getAs(0).toString(); ... So the proper way would be: String.valueOf(row.getAs[Int](0)) On Tue, Dec 19, 2017 at 4:23 AM, Anurag Sharma wrote: > The following Scala (Spark 1.6) code for reading a value from a Row fails > with a NullPointerExceptio

NullPointerException while reading a column from the row

2017-12-19 Thread Anurag Sharma
The following Scala (Spark 1.6) code for reading a value from a Row fails with a NullPointerException when the value is null: val test = row.getAs[Int]("ColumnName").toString while this works fine: val test1 = row.getAs[Int]("ColumnName") // returns 0 for null val te
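Two null-safe ways to read such a column, a sketch using the standard Row API (fieldIndex, isNullAt, getAs):

    import org.apache.spark.sql.Row

    // Check for null explicitly before converting to String.
    def readColumn(row: Row): String = {
      val i = row.fieldIndex("ColumnName")
      if (row.isNullAt(i)) "" else String.valueOf(row.get(i))
    }

    // Or carry the nullability forward as an Option.
    def readColumnOpt(row: Row): Option[String] =
      Option(row.getAs[Any]("ColumnName")).map(_.toString)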

[Spark SQL]: DataFrame schema resulting in NullPointerException

2017-11-19 Thread Chitral Verma
("name", "country") df.rdd .map(x => x.toSeq) .map(x => new GenericRowWithSchema(x.toArray, df.schema)) .foreach(println) } } This results in NullPointerException as I'm directly using df.schema in map(). What I don't understand is that if I use

Re: NullPointerException error while saving Scala Dataframe to HBase

2017-10-01 Thread Marco Mistroni
Greetings! I am repeatedly hitting a NullPointerException error while saving a Scala Dataframe to HBase. Please can you help resolve this for me. Here is the code snippet: scala> def catalog = s"""{ ||"table":{"namespace"

Re: NullPointerException error while saving Scala Dataframe to HBase

2017-09-30 Thread mailfordebu
Hi guys, I am not sure whether the email is reaching the community members. Can somebody please acknowledge? > On 30-Sep-2017, at 5:02 PM, Debabrata Ghosh wrote: > > Dear All, > Greetings! I am repeatedly hitting a NullPointerException

NullPointerException error while saving Scala Dataframe to HBase

2017-09-30 Thread Debabrata Ghosh
Dear All, Greetings! I am repeatedly hitting a NullPointerException error while saving a Scala Dataframe to HBase. Please can you help resolve this for me. Here is the code snippet: scala> def catalog = s"""{ ||"table":{"nam

Spark Streaming: NullPointerException when restoring Spark Streaming job from hdfs/s3 checkpoint

2017-05-16 Thread Richard Moorhead
I'm having some difficulty reliably restoring a streaming job from a checkpoint. When restoring a streaming job constructed from the following snippet, I receive NullPointerExceptions when `map` is called on the restored RDD. lazy val ssc = StreamingContext.getOrCreate(checkpointDir, crea
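For checkpoint recovery, the whole DStream graph must be built inside the function passed to getOrCreate; defining it lazily outside leaves restored RDDs without their transformations. A sketch of the documented pattern, with hypothetical names:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val checkpointDir = "hdfs:///checkpoints/myjob"

    def createContext(): StreamingContext = {
      val conf = new SparkConf().setAppName("my-streaming-job")
      val ssc = new StreamingContext(conf, Seconds(10))
      ssc.checkpoint(checkpointDir)
      // define all sources, transformations, and output operations here
      ssc
    }

    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()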

NullPointerException while joining two avro Hive tables

2017-02-04 Thread Понькин Алексей
Hi, I have a table in Hive(data is stored as avro files). Using python spark shell I am trying to join two datasets events = spark.sql('select * from mydb.events') intersect = events.where('attr2 in (5,6,7) and attr1 in (1,2,3)') intersect.count() But I am constantly receiving the following j

[ML - Intermediate - Debug] - Loading Customized Transformers in Apache Spark raised a NullPointerException

2017-01-24 Thread Saulo Ricci
Hi, sorry if I'm being short here. I'm facing the issue related in this link <http://stackoverflow.com/questions/41844035/loading-customized-transformers-in-apache-spark-raised-a-nullpointerexception>, I would really appreciate any help from the team and happy to talk and discuss

Re: Spark-Sql 2.0 nullpointerException

2016-10-12 Thread Selvam Raman
> When I execute the query outside of the foreach function it works fine, > but it throws a NullPointerException within the DataFrame.foreach function. > > code snippet: > > String CITATION_QUERY = "select c.citation_num, c.title, c.publisher from > test c"; > > Dat

Spark-Sql 2.0 nullpointerException

2016-10-12 Thread Selvam Raman
Hi, I am reading a parquet file and creating a temp table. When I execute the query outside of the foreach function it works fine, but it throws a NullPointerException within the DataFrame.foreach function. code snippet: String CITATION_QUERY = "select c.citation_num, c.title, c.publisher from

question about Broadcast value NullPointerException

2016-08-23 Thread Chong Zhang
Hello, I'm using Spark Streaming to process Kafka messages, and I want to use a properties file as the input and broadcast the properties: val props = new Properties() props.load(new FileInputStream(args(0))) val sc = initSparkContext() val propsBC = sc.broadcast(props) println(s"propFileBC 1: " + propsB

Re: Matrix Factorization Model model.save error "NullPointerException"

2016-07-12 Thread Zhou (Joe) Xing
Does anyone have an idea what this NPE issue below is about? Thank you! Cheers, Zhou On Jul 11, 2016, at 11:27 PM, Zhou (Joe) Xing <joe.x...@nextev.com> wrote: Hi guys, I searched the archive and also googled this problem when saving the ALS-trained Matrix Factorization Model to

Matrix Factorization Model model.save error "NullPointerException"

2016-07-11 Thread Zhou (Joe) Xing
Hi guys, I searched the archive and also googled this problem when saving the ALS-trained Matrix Factorization Model to the local file system using the Model.save() method. I found some hints, such as partitioning the model before saving, etc., but it does not seem to solve my problem. I'm always getti

Re: NullPointerException when starting StreamingContext

2016-06-24 Thread Sunita Arvind
I was able to resolve the serialization issue. The root cause was that I was accessing the config values within foreachRDD{}. The solution was to extract the values from the config outside the foreachRDD scope and pass the values into the loop directly. Probably something obvious, as we cannot have nested dist
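A sketch of that fix, assuming a Typesafe Config object as in the stack trace below: extract plain serializable values on the driver and close over those, not the config.

    // Extracted on the driver; strings serialize cleanly into the closure.
    val topic = config.getString("kafka.topic")        // hypothetical keys
    val outputPath = config.getString("output.path")

    stream.foreachRDD { rdd =>
      // refer only to topic/outputPath here, never to config itself
      rdd.saveAsTextFile(s"$outputPath/$topic")
    }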

Re: NullPointerException when starting StreamingContext

2016-06-24 Thread Cody Koeninger
That looks like a classpath problem. You should not have to include the kafka_2.10 artifact in your pom, spark-streaming-kafka_2.10 already has a transitive dependency on it. That being said, 0.8.2.1 is the correct version, so that's a little strange. How are you building and submitting your app

Re: NullPointerException when starting StreamingContext

2016-06-23 Thread Sunita Arvind
Also, just to keep it simple, I am trying to use 1.6.0-cdh5.7.0 in the pom.xml, as the cluster I am trying to run on is CDH 5.7.0 with Spark 1.6.0. Here is my pom setting: cdh.spark.version = 1.6.0-cdh5.7.0; org.apache.spark:spark-core_2.10:${cdh.spark.version} (scope compile); org.apache.spark:spark-

Re: NullPointerException when starting StreamingContext

2016-06-22 Thread Ted Yu
Which Scala version / Spark release are you using ? Cheers On Wed, Jun 22, 2016 at 8:20 PM, Sunita Arvind wrote: > Hello Experts, > > I am getting this error repeatedly: > > 16/06/23 03:06:59 ERROR streaming.StreamingContext: Error starting the > context, marking it as stopped > java.lang.Null

NullPointerException when starting StreamingContext

2016-06-22 Thread Sunita Arvind
Hello Experts, I am getting this error repeatedly: 16/06/23 03:06:59 ERROR streaming.StreamingContext: Error starting the context, marking it as stopped java.lang.NullPointerException at com.typesafe.config.impl.SerializedConfigValue.writeOrigin(SerializedConfigValue.java:202) at

Re: getting NullPointerException while doing left outer join

2016-05-06 Thread Adam Westerman
For anyone interested, the problem ended up being that in some rare cases, the value from the pair RDD on the right side of the left outer join was Java's null. The Spark optionToOptional method attempted to apply Some() to null, which caused the NPE to be thrown. The lesson is to filter out any
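A sketch of that lesson, assuming pair RDDs named leftRdd and rightRdd:

    // Drop entries whose value is a Java null before the join, so
    // optionToOptional never sees a null to wrap in Some().
    val cleanedRight = rightRdd.filter { case (_, v) => v != null }
    val joined = leftRdd.leftOuterJoin(cleanedRight)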

Re: getting NullPointerException while doing left outer join

2016-05-06 Thread Adam Westerman
Hi Ted, I am working on replicating the problem on a smaller scale. I saw that Spark 2.0 is moving to Java 8 Optional instead of Guava Optional, but in the meantime I'm stuck with 1.6.1. -Adam On Fri, May 6, 2016 at 9:40 AM, Ted Yu wrote: > Is it possible to write a short test which exhibits

Re: getting NullPointerException while doing left outer join

2016-05-06 Thread Ted Yu
Is it possible to write a short test which exhibits this problem ? For Spark 2.0, this part of code has changed: [SPARK-4819] Remove Guava's "Optional" from public API FYI On Fri, May 6, 2016 at 6:57 AM, Adam Westerman wrote: > Hi, > > I’m attempting to do a left outer join in Spark, and I’m

getting NullPointerException while doing left outer join

2016-05-06 Thread Adam Westerman
Hi, I’m attempting to do a left outer join in Spark, and I’m getting an NPE that appears to be due to some Spark Java API bug. (I’m running Spark 1.6.0 in local mode on a Mac). For a little background, the left outer join returns all keys from the left side of the join regardless of whether or no

MLLIB LDA throws NullPointerException

2016-04-06 Thread jamborta
java.lang.NullPointerException

Re: NullPointerException

2016-03-12 Thread Prabhu Joseph
Simple: add a debug statement in ExternalSorter.scala before the line which throws the NPE, recompile the Spark assembly jar, and confirm the source of the null. On Saturday, March 12, 2016, saurabh guru wrote: > I don't see how that would be possible. I am reading from a live stream of > data through k

Re: NullPointerException

2016-03-12 Thread saurabh guru
I don't see how that would be possible. I am reading from a live stream of data through kafka. On Sat 12 Mar, 2016 20:28 Ted Yu, wrote: > Interesting. > If kv._1 was null, shouldn't the NPE have come from getPartition() (line > 105) ? > > Was it possible that records.next() returned null ? > > O

Re: NullPointerException

2016-03-12 Thread Ted Yu
Interesting. If kv._1 was null, shouldn't the NPE have come from getPartition() (line 105) ? Was it possible that records.next() returned null ? On Fri, Mar 11, 2016 at 11:20 PM, Prabhu Joseph wrote: > Looking at ExternalSorter.scala line 192, i suspect some input record has > Null key. > > 189

Re: NullPointerException

2016-03-11 Thread Saurabh Guru
I am using the following versions: org.apache.spark:spark-streaming_2.10:1.6.0 and org.apache.spark:spark-streaming-kafka_2.10

Re: NullPointerException

2016-03-11 Thread Ted Yu
Which Spark release do you use ? I wonder if the following may have fixed the problem: SPARK-8029 Robust shuffle writer JIRA is down, cannot check now. On Fri, Mar 11, 2016 at 11:01 PM, Saurabh Guru wrote: > I am seeing the following exception in my Spark Cluster every few days in > production

Re: NullPointerException

2016-03-11 Thread Prabhu Joseph
Looking at ExternalSorter.scala line 192, I suspect some input record has a null key.

189 while (records.hasNext) {
190   addElementsRead()
191   kv = records.next()
192   map.changeValue((getPartition(kv._1), kv._1), update)

On Sat, Mar 12, 2016 at 12:48 PM, Prabhu Joseph wrote: > Looking
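A quick way to confirm (and work around) the null-key suspicion before the shuffle, a sketch assuming a pair RDD named pairs:

    // Count offending records, then drop them before the shuffle stage.
    val nullKeys = pairs.filter { case (k, _) => k == null }.count()
    println(s"records with null key: $nullKeys")

    val safePairs = pairs.filter { case (k, _) => k != null }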

Re: NullPointerException

2016-03-11 Thread Prabhu Joseph
Looking at ExternalSorter.scala line 192:

189 while (records.hasNext) {
      addElementsRead()
      kv = records.next()
      map.changeValue((getPartition(kv._1), kv._1), update)
      maybeSpillCollection(usingMap = true)
    }

On Sat, Mar 12, 2016 at 12:31 PM, Saurabh Guru wrote: > I am seeing the following exception

NullPointerException

2016-03-11 Thread Saurabh Guru
I am seeing the following exception in my Spark cluster every few days in production. 2016-03-12 05:30:00,541 - WARN TaskSetManager - Lost task 0.0 in stage 12528.0 (TID 18792, ip-1X-1XX-1-1XX.us-west-1.compute.internal): java.lang.NullPointerException at

Re: Streaming mapWithState API has NullPointerException

2016-02-23 Thread Tathagata Das
...especially TD and Spark Streaming folks: I am using the new Spark 1.6.0 Streaming mapWithState API in order to accomplish a streaming joining task with data. Things work fine on smaller sets of data, but on a single-node large cl

Re: Streaming mapWithState API has NullPointerException

2016-02-22 Thread Aris
...work fine on smaller sets of data, but on a single-node large cluster with JSON strings amounting to 2.5 GB problems start to occur; I get a NullPointerException. It appears to happen in my code when I call DataFrame.write.parquet(). I am reliably reprod

Re: Streaming mapWithState API has NullPointerException

2016-02-22 Thread Tathagata Das
...State API in order to accomplish a streaming joining task with data. Things work fine on smaller sets of data, but on a single-node large cluster with JSON strings amounting to 2.5 GB problems start to occur; I get a NullPointerException. It appears to happe

Streaming mapWithState API has NullPointerException

2016-02-22 Thread Aris
2.5 GB problems start to occur, I get a NullPointerException. It appears to happen in my code when I call DataFrame.write.parquet() I am reliably reproducing this, and it appears to be internal to mapWithState -- I don't know what else I can do to make progress, any thoughts? Here is the

Re: Random Forest FeatureImportance throwing NullPointerException

2016-01-14 Thread Bryan Cutler
From: Bryan Cutler [mailto:cutl...@gmail.com] Sent: Thursday, January 14, 2016 2:19 PM To: Rachana Srivastava Cc: user@spark.apache.org; d...@spark.apache.org Subject: Re: Random Forest FeatureImportance throwing NullPointerException Hi Rac

Re: Random Forest FeatureImportance throwing NullPointerException

2016-01-14 Thread Bryan Cutler
at org.apache.spark.ml.classification.RandomForestClassificationModel.featureImportances(RandomForestClassifier.scala:237) at com.markmonitor.antifraud.ce.ml.CheckFeatureImportance.main(CheckFeatureImportance.java:49) From: Rachana Srivastava Sent: Wednesday, January 13, 201

RE: Random Forest FeatureImportance throwing NullPointerException

2016-01-14 Thread Rachana Srivastava
at com.markmonitor.antifraud.ce.ml.CheckFeatureImportance.main(CheckFeatureImportance.java:49) From: Rachana Srivastava Sent: Wednesday, January 13, 2016 3:30 PM To: 'user@spark.apache.org'; 'd...@spark.apache.org' Subject: Random Forest FeatureImportance throwing NullPointerException

Random Forest FeatureImportance throwing NullPointerException

2016-01-13 Thread Rachana Srivastava
I have a Random Forest model for which I am trying to get the featureImportance vector. Map categoricalFeaturesParam = new HashMap<>(); scala.collection.immutable.Map categoricalFeatures = (scala.collection.immutable.Map) scala.collection.immutable.Map$.MODULE$.apply(JavaConversions.mapAsScalaM

DataFrame withColumnRenamed throwing NullPointerException

2016-01-05 Thread Prasad Ravilla
I am joining two data frames as shown in the code below, and this is throwing a NullPointerException. I have a number of different joins throughout the program, and the SparkContext throws this NullPointerException randomly on one of the joins. The two data frames are very large data frames

Re: NullPointerException with joda time

2015-11-12 Thread Koert Kuipers
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277) at org.apache.spark.rdd.RDD.iterator(RDD.scala:244) at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87) at org.apache.spark.rdd.RDD

Re: NullPointerException with joda time

2015-11-12 Thread Ted Yu
...runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:64) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203) at java.util.concurrent.ThreadPoolExecutor.r

Re: NullPointerException with joda time

2015-11-12 Thread Romain Sagean
...IndependentStages(DAGScheduler.scala:1210) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1199) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.sca

Re: NullPointerException with joda time

2015-11-11 Thread Ted Yu
...DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693) at scala.Option.foreach(Option.scala:236) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693) at org.apache.spark.sch

Re: NullPointerException with joda time

2015-11-10 Thread Ted Yu
...TaskSetManager: Lost task 209.0 in stage 3.0 (TID 804, R610-2.pro.hupi.loc): TaskKilled (killed intentionally) SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/

Re: NullPointerException with joda time

2015-11-10 Thread Romain Sagean
2015-11-10 18:39 GMT+01:00 Ted Yu: > Can you show the stack trace for the NPE? > > Which release of Spark are you using? > > Cheers > > On Tue, Nov 10, 2015 at 8:20 AM, romain sagean wrote: >> Hi community, >> I try to apply the function below duri

Re: NullPointerException with joda time

2015-11-10 Thread Ted Yu
Can you show the stack trace for the NPE ? Which release of Spark are you using ? Cheers On Tue, Nov 10, 2015 at 8:20 AM, romain sagean wrote: > Hi community, > I try to apply the function below during a flatMapValues or a map but I > get a nullPointerException with the plusDays(1).

NullPointerException with joda time

2015-11-10 Thread romain sagean
Hi community, I try to apply the function below during a flatMapValues or a map, but I get a NullPointerException on the plusDays(1) call. What did I miss? def allDates(dateSeq: Seq[DateTime], dateEnd: DateTime): Seq[DateTime] = { if (dateSeq.last.isBefore(dateEnd)){ allDates(dateSeq
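The preview cuts the function off; a self-contained completion of its apparent intent (enumerate the days from the last element of dateSeq up to dateEnd). Note that an NPE on plusDays inside a Spark closure is often a serialization issue (for example, Kryo writing a Joda DateTime without its Chronology) rather than a bug in the function itself:

    import org.joda.time.DateTime

    // Recursively append one day at a time until dateEnd is reached.
    def allDates(dateSeq: Seq[DateTime], dateEnd: DateTime): Seq[DateTime] =
      if (dateSeq.last.isBefore(dateEnd))
        allDates(dateSeq :+ dateSeq.last.plusDays(1), dateEnd)
      else
        dateSeq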

Re: NullPointerException when cache DataFrame in Java (Spark1.5.1)

2015-10-29 Thread Romi Kuntsman
> BUT, after changing limit(500) to limit(1000), the code reports a > NullPointerException. I had a similar situation, and the problem was with a certain record. Try to find which records are returned when you limit to 1000 but are not returned when you limit to 500. Could it be an NPE
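A sketch of how to isolate the offending records by diffing the two limits (note that limit() without an ordering is not guaranteed deterministic, so sort first if possible):

    // Rows present in the first 1000 but not in the first 500.
    val suspects = df.limit(1000).except(df.limit(500))
    suspects.show(false)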

Re: NullPointerException when cache DataFrame in Java (Spark1.5.1)

2015-10-29 Thread Zhang, Jingyu
Thanks Romi. I resized the dataset to 7 MB; however, the code shows the NullPointerException as well. "Did you try to cache a DataFrame with just a single row?" Yes, I tried, but same problem. "Do your rows have any columns with null values?" No, I had filtered out null values before caching the

Re: NullPointerException when cache DataFrame in Java (Spark1.5.1)

2015-10-28 Thread Romi Kuntsman
Did you try to cache a DataFrame with just a single row? Do your rows have any columns with null values? Can you post a code snippet here on how you load/generate the dataframe? Does dataframe.rdd.cache work? Romi Kuntsman, Big Data Engineer, http://www.totango.com On Thu, Oct 29, 2015 at 4:33

NullPointerException when cache DataFrame in Java (Spark1.5.1)

2015-10-28 Thread Zhang, Jingyu
It is not a problem to use JavaRDD.cache() for 200 MB of data (all objects read from JSON format), but when I try to use DataFrame.cache(), it throws the exception below. My machine can cache 1 GB of data in Avro format without any problem. 15/10/29 13:26:23 INFO GeneratePredicate: Code generated in 154.531

NullPointerException when adding to accumulator

2015-10-14 Thread Sela, Amit
I'm running a simple streaming application that reads from Kafka, maps the events, and prints them, and I'm trying to use accumulators to count the number of mapped records. While this works in standalone (IDE), when submitting to YARN I get a NullPointerException on the accumulator
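One common cause is holding the accumulator in a field that gets serialized into the closure and arrives null on the executors. A sketch of the lazily initialized singleton pattern recommended in the Spark Streaming guide (names hypothetical, Spark 1.x accumulator API):

    import org.apache.spark.{Accumulator, SparkContext}

    object MappedRecordCounter {
      @volatile private var instance: Accumulator[Long] = null

      // Created on the driver on first use; never serialized as a field.
      def getInstance(sc: SparkContext): Accumulator[Long] = {
        if (instance == null) synchronized {
          if (instance == null) instance = sc.accumulator(0L, "mappedRecords")
        }
        instance
      }
    }

    // inside an output operation:
    // MappedRecordCounter.getInstance(rdd.sparkContext) += 1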

Re: yarn-cluster mode throwing NullPointerException

2015-10-11 Thread Venkatakrishnan Sowrirajan
Hi Rachana, are you by any chance saying something like this in your code? "sparkConf.setMaster("yarn-cluster");" SparkContext is not supported with yarn-cluster mode. I think you are hitting this bug: https://issues.apache.org/jira/browse/SPARK-7504. This got fixed in Spark 1.4.0,
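A sketch of the fix: leave the master out of the code entirely and let spark-submit supply it.

    import org.apache.spark.{SparkConf, SparkContext}

    // No setMaster here; pass it on the command line instead, e.g.
    //   spark-submit --master yarn-cluster --class com.example.MyApp app.jar
    val conf = new SparkConf().setAppName("my-app")
    val sc = new SparkContext(conf)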

yarn-cluster mode throwing NullPointerException

2015-10-11 Thread Rachana Srivastava
I am trying to submit a job using yarn-cluster mode using spark-submit command. My code works fine when I use yarn-client mode. Cloudera Version: CDH-5.4.7-1.cdh5.4.7.p0.3 Command Submitted: spark-submit --class "com.markmonitor.antifraud.ce.KafkaURLStreaming" \ --driver-java-options "-Dlog4j

Re: Why Checkpoint is throwing "actor.OneForOneStrategy: NullPointerException"

2015-09-25 Thread Uthayan Suthakar
Thank you, Tathagata and Terry, for your responses. You guys were absolutely correct: I created a dummy DStream (to prevent the Flume channel filling up) and counted the messages, but I didn't output (print) them, hence why it reported that error. Since I called print(), the error is no longer being
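The takeaway: every DStream that should run needs an output operation registered, or it is never initialized. A cheap sketch for a dummy stream:

    // Either a real no-op output...
    dummyStream.foreachRDD(rdd => rdd.foreach(_ => ()))
    // ...or simply print a few elements each batch.
    dummyStream.print()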

Re: Why Checkpoint is throwing "actor.OneForOneStrategy: NullPointerException"

2015-09-24 Thread Terry Hoo
I met this before: in my program, some DStreams are not initialized since they are not in the path of output. You can check if yours is the same case. Thanks! - Terry On Fri, Sep 25, 2015 at 10:22 AM, Tathagata Das wrote: > Are you by any chance setting DStream.remember() with null? > > O

Re: Why Checkpoint is throwing "actor.OneForOneStrategy: NullPointerException"

2015-09-24 Thread Tathagata Das
Are you by any chance setting DStream.remember() with null? On Thu, Sep 24, 2015 at 5:02 PM, Uthayan Suthakar < uthayan.sutha...@gmail.com> wrote: > Hello all, > > My Stream job is throwing below exception at every interval. It is first > deleting the the checkpoint file and then it's trying to c

Why Checkpoint is throwing "actor.OneForOneStrategy: NullPointerException"

2015-09-24 Thread Uthayan Suthakar
Hello all, my streaming job is throwing the below exception at every interval. It is first deleting the checkpoint file and then trying to checkpoint; is this normal behaviour? I'm using Spark 1.3.0. Do you know what may cause this issue? 15/09/24 16:35:55 INFO scheduler.TaskSetManager: Finishe

Re: NullPointerException inside RDD when calling sc.textFile

2015-07-23 Thread Akhil Das
...var name = a._2.mkString(",") (a._1, name) } data.foreach { a => var file = sc.textFile(a._2) println(file.count) } And I get SparkException - NullPointerException when I try to call textFile. The error stack refers to an

NullPointerException inside RDD when calling sc.textFile

2015-07-21 Thread MorEru
file.count) } And I get SparkException - NullPointerException when I try to call textFile. The error stack refers to an Iterator inside the RDD. I am not able to understand the error - 15/07/21 15:37:37 INFO TaskSchedulerImpl: Removed TaskSet 65.0, whose tasks have all completed, from pool org.a
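SparkContext is driver-only, so sc.textFile inside an RDD foreach runs on executors where sc is null. A sketch of the usual restructuring, assuming data is an RDD of (key, path) pairs as in the post: bring the paths to the driver first.

    // Materialize the small list of paths on the driver, then loop there.
    data.collect().foreach { case (_, path) =>
      val file = sc.textFile(path)
      println(file.count())
    }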

Re: NullPointerException with functions.rand()

2015-06-12 Thread Ted Yu
at org.apache.spark.sql.catalyst.expressions.RDG.rng$lzycompute(random.scala:39) at org.apache.spark.sql.catalyst.expressions.RDG.rng(random.scala:39) ... Does anyone know why? Thanks. Justin

Re: hiveContext.sql NullPointerException

2015-06-11 Thread patcharee
On 6/7/15 9:48 PM, patcharee wrote: Hi, how can I work with HiveContext on the executor? If only the driver can see HiveContext, does it mean I have to collect all datasets (very large) to the driver and use HiveContext there? It will overload the driver's memory and fail. BR, Patc

Re: NullPointerException with functions.rand()

2015-06-10 Thread Ted Yu
at org.apache.spark.sql.catalyst.expressions.RDG.rng$lzycompute(random.scala:39) at org.apache.spark.sql.catalyst.expressions.RDG.rng(random.scala:39) ... Does anyone know why? Thanks. Justin

NullPointerException with functions.rand()

2015-06-10 Thread Justin Yip
...catalyst.expressions.RDG.rng$lzycompute(random.scala:39) at org.apache.spark.sql.catalyst.expressions.RDG.rng(random.scala:39) ... Does anyone know why? Thanks. Justin

Re: hiveContext.sql NullPointerException

2015-06-08 Thread Cheng Lian
...executor side, where no viable HiveContext instance exists. Cheng On 6/7/15 10:06 AM, patcharee wrote: Hi, I try to insert data into a partitioned Hive table. The groupByKey is to combine the dataset into a partition of the Hive table. After the groupByKey, I converted the Iterable[X] to a DF by X.toLi

Re: hiveContext.sql NullPointerException

2015-06-08 Thread patcharee
...The groupByKey is to combine the dataset into a partition of the Hive table. After the groupByKey, I converted the Iterable[X] to a DF by X.toList.toDF(). But the hiveContext.sql throws a NullPointerException; see below. Any suggestions? What could be wrong? Thanks! val varWHeightFlatRDD = varWHeightRDD.f

Re: hiveContext.sql NullPointerException

2015-06-07 Thread Cheng Lian
...data into a partitioned Hive table. The groupByKey is to combine the dataset into a partition of the Hive table. After the groupByKey, I converted the Iterable[X] to a DF by X.toList.toDF(). But the hiveContext.sql throws a NullPointerException; see below. Any suggestions? What could be wrong? Tha

Re: NullPointerException SQLConf.setConf

2015-06-07 Thread Cheng Lian
Are you calling hiveContext.sql within an RDD.map closure or something similar? In that case, the call actually happens on the executor side; however, HiveContext only exists on the driver side. Cheng On 6/4/15 3:45 PM, patcharee wrote: Hi, I am using Hive 0.14 and Spark 0.13. I got a java.lang.Nu
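A sketch of the restructuring this implies, with hypothetical table and partition names: decide the partitions on the driver and issue all SQL there, keeping executor-side code to plain transformations.

    // Derive the partition list on the driver (collect() runs the job),
    // then call hiveContext.sql in plain driver-side code.
    val partitions = rdd.keys.distinct().collect()

    partitions.foreach { p =>
      hiveContext.sql(
        s"INSERT OVERWRITE TABLE target PARTITION (part='$p') " +
        s"SELECT * FROM staging WHERE part = '$p'")
    }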

Re: hiveContext.sql NullPointerException

2015-06-07 Thread patcharee
...patcharee wrote: Hi, I try to insert data into a partitioned Hive table. The groupByKey is to combine the dataset into a partition of the Hive table. After the groupByKey, I converted the Iterable[X] to a DF by X.toList.toDF(). But the hiveContext.sql throws a NullPointerException; see below. Any

Re: hiveContext.sql NullPointerException

2015-06-07 Thread Cheng Lian
...wrote: Hi, I try to insert data into a partitioned Hive table. The groupByKey is to combine the dataset into a partition of the Hive table. After the groupByKey, I converted the Iterable[X] to a DF by X.toList.toDF(). But the hiveContext.sql throws a NullPointerException; see below. Any suggestions

hiveContext.sql NullPointerException

2015-06-06 Thread patcharee
Hi, I try to insert data into a partitioned Hive table. The groupByKey is to combine the dataset into a partition of the Hive table. After the groupByKey, I converted the Iterable[X] to a DF by X.toList.toDF(). But the hiveContext.sql throws a NullPointerException; see below. Any suggestions? What

NullPointerException SQLConf.setConf

2015-06-04 Thread patcharee
Hi, I am using Hive 0.14 and Spark 0.13. I got a java.lang.NullPointerException when inserting into Hive. Any suggestions, please. hiveContext.sql("INSERT OVERWRITE table 4dim partition (zone=" + ZONE + ",z=" + zz + ",year=" + YEAR + ",month=" + MONTH + ") " + "select date, hh, x, y, hei

NullPointerException when accessing broadcast variable in DStream

2015-05-18 Thread hotienvu
...LIMIT is not accessible within the stream. I'm running Spark 1.3.1 in standalone mode with a 2-node cluster. I tried with spark-shell and it works fine. Please help! Thanks

NullPointerException while creating DataFrame from an S3 Avro Object

2015-05-13 Thread Mohammad Tariq
Hi list, I have just started using Spark and am trying to create a DataFrame from an Avro file stored in Amazon S3. I am using the Spark-Avro library for this. The code which I'm using is shown below. Nothing fancy, just the basic prototype as shown on the Spark
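A minimal sketch of reading Avro with the spark-avro package on Spark 1.x (bucket and path hypothetical; the S3 filesystem scheme and credentials depend on the EMR/Hadoop setup):

    import org.apache.spark.sql.SQLContext

    val sqlContext = new SQLContext(sc)
    val df = sqlContext.read
      .format("com.databricks.spark.avro")
      .load("s3n://my-bucket/path/to/data.avro")
    df.printSchema()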

NullPointerException with Avro + Spark.

2015-05-01 Thread ๏̯͡๏
...vi) } It's a simple mapping of input records to (itemId, record). I found this http://stackoverflow.com/questions/23962796/kryo-readobject-cause-nullpointerexception-with-arraylist and http://apache-spark-user-list.1001560.n3.nabble.com/Kryo-NPE-with-Array-td19797.html Looks like

NullPointerException in TaskSetManager

2015-02-26 Thread gtinside
Hi, I am trying to run a simple Hadoop job (that uses CassandraHadoopInputOutputWriter) on Spark (v1.2, Hadoop v1.x) but am getting a NullPointerException in TaskSetManager. WARN 2015-02-26 14:21:43,217 [task-result-getter-0] TaskSetManager - Lost task 14.2 in stage 0.0 (TID 29, devntom003

Re: NullPointerException in ApplicationMaster

2015-02-25 Thread Zhan Zhang
Look at the trace again. It is a very weird error. The SparkSubmit is running on the client side, but YarnClusterSchedulerBackend is supposed to be running in the YARN AM. I suspect you are running the cluster with yarn-client mode, but in JavaSparkContext you set "yarn-cluster". As a result, the Spark contex

Re: NullPointerException in ApplicationMaster

2015-02-25 Thread Zhan Zhang
Hi Mate, when you initialize the JavaSparkContext, you don't need to specify the mode "yarn-cluster". I suspect that is the root cause. Thanks. Zhan Zhang On Feb 25, 2015, at 10:12 AM, gulyasm <mgulya...@gmail.com> wrote: JavaSparkContext.

NullPointerException in ApplicationMaster

2015-02-25 Thread gulyasm
Hi all, I am trying to run a Spark Java application on EMR, but I keep getting a NullPointerException from the ApplicationMaster (Spark version on EMR: 1.2). The stacktrace is below. I also tried to run the application on Hortonworks Sandbox (2.2) with Spark 1.2, following the blog post (http
