[jira] [Comment Edited] (SPARK-20006) Separate threshold for broadcast and shuffled hash join
[ https://issues.apache.org/jira/browse/SPARK-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15931054#comment-15931054 ] Zhan Zhang edited comment on SPARK-20006 at 3/18/17 4:42 AM: - The default ShuffledHashJoin threshold can fall back to the broadcast one. A separate configuration does give us opportunities to optimize the join dramatically. It would be great if the CBO could automatically find the best strategy, but perhaps I am missing something: currently the CBO does not collect the correct statistics, especially for partitioned tables. I have opened a JIRA for that issue as well: https://issues.apache.org/jira/browse/SPARK-19890 was (Author: zhzhan): The default ShuffledHashJoin threshold can fallback to the broadcast one. A separate configuration does provide us opportunities to optimize the join dramatically. It would be great if CBO can automatically find the best strategy. But probably I miss something. Currently the CBO does not collect right statistics, especially for partitioned table. https://issues.apache.org/jira/browse/SPARK-19890 > Separate threshold for broadcast and shuffled hash join > --- > > Key: SPARK-20006 > URL: https://issues.apache.org/jira/browse/SPARK-20006 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >Reporter: Zhan Zhang >Priority: Minor > > Currently both canBroadcast and canBuildLocalHashMap use the same > configuration: AUTO_BROADCASTJOIN_THRESHOLD. > But the memory model may be different. For broadcast, the hash map is > currently always built on heap. For shuffled hash join, the hash map may be > built on heap (longHash) or off heap (another map, if off-heap is enabled). > Sharing one configuration makes it hard to tune (how to allocate memory > on-heap/off-heap). I propose to use separate configurations. Please comment > on whether this is reasonable.
-- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
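The proposal above amounts to splitting one knob into two. A hypothetical configuration sketch: only `spark.sql.autoBroadcastJoinThreshold` actually exists as of Spark 2.1; the second property name is invented here purely to illustrate the separation being proposed.

```properties
# Existing: one threshold gates both broadcast join and shuffled hash join
spark.sql.autoBroadcastJoinThreshold   10485760
# Hypothetical, per this issue: a separate knob for the shuffled-hash-join
# build side, which may live off heap (property name is illustrative only)
spark.sql.shuffledHashJoinThreshold    52428800
```

With two knobs, a user could keep broadcast small (it always builds on heap) while allowing a larger build side for shuffled hash join when off-heap memory is available.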
[jira] [Commented] (SPARK-20006) Separate threshold for broadcast and shuffled hash join
[ https://issues.apache.org/jira/browse/SPARK-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15931054#comment-15931054 ] Zhan Zhang commented on SPARK-20006: The default ShuffledHashJoin threshold can fall back to the broadcast one. A separate configuration does give us opportunities to optimize the join dramatically. It would be great if the CBO could automatically find the best strategy, but perhaps I am missing something: currently the CBO does not collect the correct statistics, especially for partitioned tables. https://issues.apache.org/jira/browse/SPARK-19890
[jira] [Commented] (SPARK-20006) Separate threshold for broadcast and shuffled hash join
[ https://issues.apache.org/jira/browse/SPARK-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15931050#comment-15931050 ] Takeshi Yamamuro commented on SPARK-20006: -- I feel that the more options we have for controlling plan strategies, the more difficult it becomes for users to use DataFrame/Dataset. Essentially, I think the CBO should control these kinds of things.
[jira] [Commented] (SPARK-20009) Use user-friendly DDL formats for defining a schema in user-facing APIs
[ https://issues.apache.org/jira/browse/SPARK-20009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15931044#comment-15931044 ] Takeshi Yamamuro commented on SPARK-20009: -- Does this make sense? cc: [~smilegator] My prototype is here: https://github.com/apache/spark/compare/master...maropu:UserDDLForSchema > Use user-friendly DDL formats for defining a schema in user-facing APIs > > > Key: SPARK-20009 > URL: https://issues.apache.org/jira/browse/SPARK-20009 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.1.0 >Reporter: Takeshi Yamamuro > > In https://issues.apache.org/jira/browse/SPARK-19830, we added a new API in the > DDL parser to convert a DDL string into a schema. We can then use DDL > formats in some existing APIs, e.g., functions.from_json > https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L3062.
[jira] [Created] (SPARK-20009) Use user-friendly DDL formats for defining a schema in user-facing APIs
Takeshi Yamamuro created SPARK-20009: Summary: Use user-friendly DDL formats for defining a schema in user-facing APIs Key: SPARK-20009 URL: https://issues.apache.org/jira/browse/SPARK-20009 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 2.1.0 Reporter: Takeshi Yamamuro In https://issues.apache.org/jira/browse/SPARK-19830, we added a new API in the DDL parser to convert a DDL string into a schema. We can then use DDL formats in some existing APIs, e.g., functions.from_json https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L3062.
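To illustrate the user-facing format this issue is about: a compact DDL string such as "a INT, b STRING" replaces a programmatically built StructType. The toy parser below is NOT Spark's actual parser (CatalystSqlParser handles the real case); the object and method names are invented for illustration only.

```scala
// Toy sketch of the DDL-format idea: parse "name TYPE, name TYPE, ..."
// into (name, type) pairs. Spark's real API returns a StructType.
object DdlSchemaSketch {
  def parseDdl(ddl: String): Seq[(String, String)] =
    ddl.split(",").map(_.trim).filter(_.nonEmpty).map { field =>
      // Split each "name TYPE" field on the first run of whitespace.
      val Array(name, dataType) = field.split("\\s+", 2)
      (name, dataType.trim.toUpperCase)
    }.toSeq

  def main(args: Array[String]): Unit = {
    assert(parseDdl("a INT, b STRING") == Seq(("a", "INT"), ("b", "STRING")))
    println(parseDdl("a INT, b STRING"))
  }
}
```

The appeal for users is exactly this brevity: one string instead of `StructType(Seq(StructField(...), ...))` when calling, e.g., `functions.from_json`.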
[jira] [Commented] (SPARK-18886) Delay scheduling should not delay some executors indefinitely if one task is scheduled before delay timeout
[ https://issues.apache.org/jira/browse/SPARK-18886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15931009#comment-15931009 ] Kay Ousterhout commented on SPARK-18886: Sorry for the slow response here! I realized this is the same issue as SPARK-11460 (although that JIRA proposed a slightly different solution), which stalled for reasons that are completely my fault (I neglected it because I couldn't think of a practical way of solving it). Imran, unfortunately I don't think your latest idea will quite work. Delay scheduling was originally intended for situations where the number of slots that a particular job could use was limited by a fairness policy. In that case, it can be better to wait a bit for a "better" slot (i.e., one that satisfies locality preferences). In particular, if you never wait, you end up with this "sticky slot" issue where tasks for a job keep finishing up in a "bad" slot (one with no locality preferences), which is then re-offered to the same job, which will again accept the bad slot. If the job just waited a bit, it could get a better slot (e.g., as a result of tasks from another job finishing). [1] This relates to your idea because of the following situation: suppose you have a cluster with 10 machines, the job has locality preferences for 5 of them (with ids 1, 2, 3, 4, 5), and fairness dictates that the job can only use 3 slots at a time (e.g., it's sharing equally with 2 other jobs). Suppose that for a long time, the job has been running tasks on slots 1, 2, and 3 (so local slots). At this point, the timers for machines 6, 7, 8, 9, and 10 will have expired, because the job has been running for a while. But if the job is now offered a slot on one of those non-local machines (e.g., 6), the job hasn't been waiting long for non-local resources: until this point, it's been running its full share of 3 slots at a time, and it's been doing so on machines that satisfy locality preferences. 
So, we shouldn't accept that slot on machine 6 -- we should wait a bit to see if we can get a slot on 1, 2, 3, 4, or 5. The solution I proposed (in a long PR comment) for the other JIRA is: if the task set is using fewer than the number of slots it could be using (where “# slots it could be using” is all of the slots in the cluster if the job is running alone, or the job’s fair share, if it’s not) for some period of time, increase the locality level. The problem with that solution is that I thought it was completely impractical to determine the number of slots a TSM "should" be allowed to use. However, after thinking about this more today, I think we might be able to do this in a practical way: - First, I thought that we could use information about when offers are rejected to determine this (e.g., if you've been rejecting offers for a while, then you're not using your fair share). But the problem here is that it's not easy to determine when you *are* using your fair / allowed share: accepting a single offer doesn't necessarily mean that you're now using the allowed share. This is precisely the problem with the current approach, hence this JIRA. - v1: one possible proxy for this is whether there are slots currently available that haven't been accepted by any job. The TaskSchedulerImpl could feasibly pass this information to each TaskSetManager, and the TSM could use it to update its delay timer: something like only reset the delay timer to 0 if (a) the TSM accepts an offer and (b) the flag passed by the TaskSchedulerImpl indicates that there are no other unused slots in the cluster. This fixes the problem described in the JIRA: in that case, the flag would indicate that there *were* other unused slots, even though a task got successfully scheduled with this offer, so the delay timer wouldn't be reset, and would eventually correctly expire. - v2: The problem with v1 is that it doesn't correctly handle situations where, e.g., you have two jobs A and B with equal shares. 
B is "greedy" and will accept any slot (e.g., it's a reduce stage), and A is doing delay scheduling. In this case, A might have much less than its share, but the flag from the TaskSchedulerImpl would indicate that there were no other free slots in the cluster, so the delay timer wouldn't ever expire. I suspect we could handle this (e.g., with some logic in the TaskSchedulerImpl to detect when a particular TSM is getting starved: when it keeps rejecting offers that are later accepted by someone else) but before thinking about this further, I wanted to run the general idea by you to see what your thoughts are. [1] There's a whole side question / discussion of how often this is useful for Spark at all. It can be useful if you're running in a shared cluster where e.g. Yarn might be assigning you more slots over time, and it's also useful when a single Spark context is being shared across many
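The "v1" reset rule described above can be sketched concretely. This is a hedged illustration only, not Spark's actual TaskSetManager logic: a timer that is reset only when an offer is accepted AND the scheduler reports no other idle slots in the cluster; all names are invented.

```scala
// Sketch of the v1 rule: accepting an offer no longer proves the task
// set has its full share, so the locality-wait timer is reset only when
// the offer is accepted AND nothing else in the cluster sits idle.
class DelayTimerSketch(localityWaitMs: Long) {
  private var timerStartMs: Long = 0L

  def start(nowMs: Long): Unit = timerStartMs = nowMs

  // Called after each resource offer is answered.
  def onOffer(accepted: Boolean, otherIdleSlotsExist: Boolean, nowMs: Long): Unit =
    if (accepted && !otherIdleSlotsExist) timerStartMs = nowMs

  // When true, the task set should fall back to a less local level.
  def shouldRelaxLocality(nowMs: Long): Boolean =
    nowMs - timerStartMs > localityWaitMs
}
```

Under this rule, the scenario in the JIRA behaves correctly: a task scheduled while other slots remain unused does not reset the timer, so the timer eventually expires and the task set relaxes its locality level.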
[jira] [Resolved] (SPARK-11460) Locality waits should be based on task set creation time, not last launch time
[ https://issues.apache.org/jira/browse/SPARK-11460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout resolved SPARK-11460. Resolution: Duplicate > Locality waits should be based on task set creation time, not last launch time > -- > > Key: SPARK-11460 > URL: https://issues.apache.org/jira/browse/SPARK-11460 > Project: Spark > Issue Type: Bug > Components: Scheduler >Affects Versions: 1.0.0, 1.0.1, 1.0.2, 1.1.0, 1.1.1, 1.2.0, 1.2.1, 1.2.2, > 1.3.0, 1.3.1, 1.4.0, 1.4.1, 1.5.0, 1.5.1 > Environment: YARN >Reporter: Shengyue Ji > Original Estimate: 2h > Remaining Estimate: 2h > > Spark waits for the spark.locality.wait period before going from RACK_LOCAL to > ANY when selecting an executor for assignment. The timeout is essentially > reset each time a new assignment is made. > We were running Spark Streaming on Kafka with a 10 second batch window on 32 > Kafka partitions with 16 executors. All executors were in the ANY group. At > one point one RACK_LOCAL executor was added and all tasks were assigned to > it. Each task took about 0.6 seconds to process, resetting the > spark.locality.wait timeout (3000ms) repeatedly. This caused the whole > process to underutilize resources and created an increasing backlog. > spark.locality.wait should be based on the task set creation time, not the last > launch time, so that 3000ms after initial creation, all executors can get > tasks assigned to them. > We are specifying a zero timeout for now as a workaround to disable locality > optimization. > https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala#L556
[jira] [Comment Edited] (SPARK-20000) Spark Hive tests aborted due to lz4-java on ppc64le
[ https://issues.apache.org/jira/browse/SPARK-20000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15931011#comment-15931011 ] Takeshi Yamamuro edited comment on SPARK-20000 at 3/18/17 2:22 AM: --- oh! congrats! was (Author: maropu): oh! congrat! > Spark Hive tests aborted due to lz4-java on ppc64le > --- > > Key: SPARK-20000 > URL: https://issues.apache.org/jira/browse/SPARK-20000 > Project: Spark > Issue Type: Bug > Components: Tests >Affects Versions: 2.2.0 > Environment: Ubuntu 14.04 ppc64le > $ java -version > openjdk version "1.8.0_111" > OpenJDK Runtime Environment (build 1.8.0_111-8u111-b14-3~14.04.1-b14) > OpenJDK 64-Bit Server VM (build 25.111-b14, mixed mode) >Reporter: Sonia Garudi > Labels: ppc64le > Attachments: hs_err_pid.log > > > The tests are getting aborted in the Spark Hive project with the following error: > {code:borderStyle=solid} > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGSEGV (0xb) at pc=0x3fff94dbf114, pid=6160, tid=0x3fff6efef1a0 > # > # JRE version: OpenJDK Runtime Environment (8.0_111-b14) (build > 1.8.0_111-8u111-b14-3~14.04.1-b14) > # Java VM: OpenJDK 64-Bit Server VM (25.111-b14 mixed mode linux-ppc64 > compressed oops) > # Problematic frame: > # V [libjvm.so+0x56f114] > {code} > In the thread log file, I found the following traces: > Event: 3669.042 Thread 0x3fff89976800 Exception 'java/lang/NoClassDefFoundError': Could not initialize class > net.jpountz.lz4.LZ4JNI (0x00079fcda3b8) thrown at > [/build/openjdk-8-fVIxxI/openjdk-8-8u111-b14/src/hotspot/src/share/vm/oops/instanceKlass.cpp, > line 890] > This error is due to lz4-java (version 1.3.0), which doesn’t have support > for ppc64le. PFA the thread log file.
[jira] [Commented] (SPARK-20000) Spark Hive tests aborted due to lz4-java on ppc64le
[ https://issues.apache.org/jira/browse/SPARK-20000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15931011#comment-15931011 ] Takeshi Yamamuro commented on SPARK-20000: -- oh! congrat!
[jira] [Created] (SPARK-20008) hiveContext.emptyDataFrame.except(hiveContext.emptyDataFrame).count() returns 1
Ravindra Bajpai created SPARK-20008: --- Summary: hiveContext.emptyDataFrame.except(hiveContext.emptyDataFrame).count() returns 1 Key: SPARK-20008 URL: https://issues.apache.org/jira/browse/SPARK-20008 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 2.0.2 Reporter: Ravindra Bajpai hiveContext.emptyDataFrame.except(hiveContext.emptyDataFrame).count() yields 1 against the expected 0. This was not the case with Spark 1.5.2. This is an API change from a usage point of view, and hence I consider it a bug. It may be a boundary case, not sure. Workaround: for now I check that the counts are != 0 before this operation. That is not good for performance, hence creating a JIRA to track it. As Young Zhang explained in reply to my mail: starting from Spark 2, these kinds of operations are implemented as a left anti join, instead of using RDD operations directly. The same issue also occurs on sqlContext.
scala> spark.version
res25: String = 2.0.2
spark.sqlContext.emptyDataFrame.except(spark.sqlContext.emptyDataFrame).explain(true)
== Physical Plan ==
*HashAggregate(keys=[], functions=[], output=[])
+- Exchange SinglePartition
   +- *HashAggregate(keys=[], functions=[], output=[])
      +- BroadcastNestedLoopJoin BuildRight, LeftAnti, false
         :- Scan ExistingRDD[]
         +- BroadcastExchange IdentityBroadcastMode
            +- Scan ExistingRDD[]
This arguably indicates a bug, but my guess is that it is likely the logic of comparing NULL = NULL (should it return true or false?) that causes this kind of confusion.
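For reference, the semantics the reporter expects: `except` keeps the distinct rows of the left input that do not appear in the right, so with two empty inputs the result (and its count) must be empty. A trivial model using plain Scala collections, not Spark:

```scala
// Reference model of except(): distinct left rows not present in right.
// With two empty inputs the result must be empty (count 0), whereas the
// Spark 2.0.2 left-anti-join plan quoted above returns count 1.
object ExceptModel {
  def except[T](left: Seq[T], right: Seq[T]): Seq[T] =
    left.distinct.filterNot(right.toSet)

  def main(args: Array[String]): Unit = {
    assert(except(Seq.empty[Int], Seq.empty[Int]).isEmpty)
  }
}
```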
[jira] [Comment Edited] (SPARK-20001) Support PythonRunner executing inside a Conda env
[ https://issues.apache.org/jira/browse/SPARK-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930922#comment-15930922 ] Jeff Zhang edited comment on SPARK-20001 at 3/18/17 12:17 AM: -- Thanks [~dansanduleac] It looks like we are doing similar things; recently I made some improvements in SPARK-13587. You can check my doc if you are interested: https://docs.google.com/document/d/1KB9RYW8_bSeOzwVqZFc_zy_vXqqqctwrU5TROP_16Ds/edit And maybe we can combine our approaches to get this accepted by the community ASAP. VirtualEnv is a pretty important feature for PySpark, IMHO. was (Author: zjffdu): Thanks [~dansanduleac] It looks like we are doing similar things, recently I made some improvements in SPARK-13587, you can check my doc if you are interested. https://docs.google.com/document/d/1KB9RYW8_bSeOzwVqZFc_zy_vXqqqctwrU5TROP_16Ds/edit > Support PythonRunner executing inside a Conda env > - > > Key: SPARK-20001 > URL: https://issues.apache.org/jira/browse/SPARK-20001 > Project: Spark > Issue Type: New Feature > Components: PySpark, Spark Core >Affects Versions: 2.2.0 >Reporter: Dan Sanduleac > Original Estimate: 168h > Remaining Estimate: 168h > > Similar to SPARK-13587, I'm trying to allow the user to configure a Conda > environment that PythonRunner will run from. > This change remembers the conda environment found on the driver and installs > the same packages on the executor side, only once per PythonWorkerFactory. > The list of requested conda packages is added to the PythonWorkerFactory > cache, so two collects using the same environment (incl. packages) can re-use > the same running executors. > You have to specify outright what packages and channels to "bootstrap" the > environment with. > However, SparkContext (as well as JavaSparkContext & the pyspark version) is > expanded to support addCondaPackage and addCondaChannel. 
> Rationale is: > * you might want to add more packages once you're already running in the > driver > * you might want to add a channel which requires some token for > authentication, which you don't yet have access to until the module is > already running > This issue requires that the conda binary already be available on the driver > as well as the executors; you just have to specify where it can be found. > Please see the attached pull request on palantir/spark for additional > details: https://github.com/palantir/spark/pull/115 > As for tests, there is a local Python test, as well as YARN client- and > cluster-mode tests, which ensure that a newly installed package is visible > from both the driver and the executor.
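The caching behavior described above (two collects with the same environment reuse the same running executors) can be sketched as keying the worker factory by the requested channels and packages. The class and method names below are illustrative only, not the actual palantir/spark implementation:

```scala
import scala.collection.mutable

// Illustrative key: a worker factory is identified by the conda
// channels and packages requested for the environment.
case class CondaEnvKey(channels: List[String], packages: List[String])

class WorkerFactoryCache {
  private val cache = mutable.Map.empty[CondaEnvKey, String]
  private var created = 0

  // Return the cached factory for this environment, creating one only
  // the first time this (channels, packages) combination is seen.
  def getOrCreate(key: CondaEnvKey): String =
    cache.getOrElseUpdate(key, { created += 1; s"factory-$created" })

  def factoriesCreated: Int = created
}
```

Two jobs requesting an identical environment hit the same cache entry, so the (expensive) conda install on the executor side happens once per distinct environment rather than once per job.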
[jira] [Comment Edited] (SPARK-20001) Support PythonRunner executing inside a Conda env
[ https://issues.apache.org/jira/browse/SPARK-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930922#comment-15930922 ] Jeff Zhang edited comment on SPARK-20001 at 3/18/17 12:14 AM: -- Thanks [~dansanduleac] It looks like we are doing similar things, recently I made some improvements in SPARK-13587, you can check my doc if you are interested. https://docs.google.com/document/d/1KB9RYW8_bSeOzwVqZFc_zy_vXqqqctwrU5TROP_16Ds/edit was (Author: zjffdu): Thanks [~dansanduleac] It looks like we are do similar things, recently I made some improvements in SPARK-13587, you can check my doc if you are interested. https://docs.google.com/document/d/1KB9RYW8_bSeOzwVqZFc_zy_vXqqqctwrU5TROP_16Ds/edit
[jira] [Issue Comment Deleted] (SPARK-20001) Support PythonRunner executing inside a Conda env
[ https://issues.apache.org/jira/browse/SPARK-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-20001: --- Comment: was deleted (was: Thanks [~dansanduleac] It looks like we are do similar things, recently I made some improvements in SPARK-13587, you can check my doc if you are interested. https://docs.google.com/document/d/1KB9RYW8_bSeOzwVqZFc_zy_vXqqqctwrU5TROP_16Ds/edit)
[jira] [Commented] (SPARK-20001) Support PythonRunner executing inside a Conda env
[ https://issues.apache.org/jira/browse/SPARK-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930924#comment-15930924 ] Jeff Zhang commented on SPARK-20001: Thanks [~dansanduleac] It looks like we are do similar things, recently I made some improvements in SPARK-13587, you can check my doc if you are interested. https://docs.google.com/document/d/1KB9RYW8_bSeOzwVqZFc_zy_vXqqqctwrU5TROP_16Ds/edit
[jira] [Commented] (SPARK-20001) Support PythonRunner executing inside a Conda env
[ https://issues.apache.org/jira/browse/SPARK-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930922#comment-15930922 ] Jeff Zhang commented on SPARK-20001: Thanks [~dansanduleac] It looks like we are do similar things, recently I made some improvements in SPARK-13587, you can check my doc if you are interested. https://docs.google.com/document/d/1KB9RYW8_bSeOzwVqZFc_zy_vXqqqctwrU5TROP_16Ds/edit
[jira] [Resolved] (SPARK-18890) Do all task serialization in CoarseGrainedExecutorBackend thread (rather than TaskSchedulerImpl)
[ https://issues.apache.org/jira/browse/SPARK-18890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout resolved SPARK-18890. Resolution: Invalid I closed this because, as [~imranr] pointed out on the PR, these already happen in the same thread. [~witgo], can you change your PR to reference SPARK-19486, which describes the behavior you implemented? > Do all task serialization in CoarseGrainedExecutorBackend thread (rather than > TaskSchedulerImpl) > > > Key: SPARK-18890 > URL: https://issues.apache.org/jira/browse/SPARK-18890 > Project: Spark > Issue Type: Improvement > Components: Scheduler >Affects Versions: 2.1.0 >Reporter: Kay Ousterhout >Priority: Minor > > As part of benchmarking this change: > https://github.com/apache/spark/pull/15505 and alternatives, [~shivaram] and > I found that moving task serialization from TaskSetManager (which happens as > part of the TaskSchedulerImpl's thread) to CoarseGrainedSchedulerBackend leads > to approximately a 10% reduction in job runtime for a job that counted 10,000 > partitions (that each had 1 int) using 20 machines. Similar performance > improvements were reported in the pull request linked above. This would > appear to be because the TaskSchedulerImpl thread is the bottleneck, so > moving serialization to CGSB reduces runtime. This change may *not* improve > runtime (and could potentially worsen runtime) in scenarios where the CGSB > thread is the bottleneck (e.g., if tasks are very large, so calling launch to > send the tasks to the executor blocks on the network). > One benefit of implementing this change is that it makes it easier to > parallelize the serialization of tasks (different tasks could be serialized > by different threads). Another benefit is that all of the serialization > occurs in the same place (currently, the Task is serialized in > TaskSetManager, and the TaskDescription is serialized in CGSB). 
> I'm not totally convinced we should fix this because it seems like there are > better ways of reducing the serialization time (e.g., by re-using a single > serialized object with the Task/jars/files and broadcasting it for each > stage) but I wanted to open this JIRA to document the discussion. > cc [~witgo] -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
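The parallelization benefit mentioned above (different tasks serialized by different threads, once serialization is out of the single scheduler thread) can be sketched with a toy example. This is plain Python with pickle and a thread pool, not Spark code; `serialize_task` is an illustrative stand-in for serializing a TaskDescription:

```python
import pickle
from concurrent.futures import ThreadPoolExecutor

def serialize_task(task):
    # Stand-in for serializing a TaskDescription before launch.
    return pickle.dumps(task)

tasks = [{"stage": 0, "partition": p, "payload": list(range(10))} for p in range(100)]

# Single-threaded: everything runs on one scheduler thread (the bottleneck).
serialized_serial = [serialize_task(t) for t in tasks]

# Parallel: once serialization happens outside that thread, different tasks
# can be serialized by different worker threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    serialized_parallel = list(pool.map(serialize_task, tasks))

# Same bytes either way; only the threading changes.
assert serialized_serial == serialized_parallel
```

As the resolution notes, this only helps when the serializing thread is the bottleneck; if launch itself blocks on the network, moving the work does not.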
[jira] [Commented] (SPARK-19565) After fetching failed, success of old attempt of stage should be taken as valid.
[ https://issues.apache.org/jira/browse/SPARK-19565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930910#comment-15930910 ] Kay Ousterhout commented on SPARK-19565: [~jinxing6...@126.com] I closed this because it looks like a duplicate of the work you did for SPARK-19263. Feel free to re-open if I've misunderstood. > After fetching failed, success of old attempt of stage should be taken as > valid. > > > Key: SPARK-19565 > URL: https://issues.apache.org/jira/browse/SPARK-19565 > Project: Spark > Issue Type: Test > Components: Scheduler >Affects Versions: 2.1.0 >Reporter: jin xing > > This is related to SPARK-19263. > When fetch failed, stage will be resubmitted. There can be running tasks from > both old and new stage attempts. Success of tasks from old stage attempt > should be taken as valid and partitionId should be removed from stage's > pendingPartitions accordingly. When pending partitions is empty, downstream > stage can be scheduled, even though there's still running tasks in the > active(new) stage attempt. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-19565) After fetching failed, success of old attempt of stage should be taken as valid.
[ https://issues.apache.org/jira/browse/SPARK-19565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout resolved SPARK-19565. Resolution: Duplicate > After fetching failed, success of old attempt of stage should be taken as > valid. > > > Key: SPARK-19565 > URL: https://issues.apache.org/jira/browse/SPARK-19565 > Project: Spark > Issue Type: Test > Components: Scheduler >Affects Versions: 2.1.0 >Reporter: jin xing > > This is related to SPARK-19263. > When fetch failed, stage will be resubmitted. There can be running tasks from > both old and new stage attempts. Success of tasks from old stage attempt > should be taken as valid and partitionId should be removed from stage's > pendingPartitions accordingly. When pending partitions is empty, downstream > stage can be scheduled, even though there's still running tasks in the > active(new) stage attempt. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
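The bookkeeping described in the issue (a success from an old stage attempt still removes its partition from pendingPartitions) can be sketched as a toy model. This is an illustration, not Spark's DAGScheduler; the names are hypothetical:

```python
# Toy model: a stage tracks which partitions still need a successful task.
pending = {0, 1, 2}          # partitions of the resubmitted stage
active_attempt = 1           # new attempt started after a fetch failure

def on_task_success(partition, attempt):
    # A success from an *old* attempt is still a valid partition output,
    # so it is removed from pendingPartitions regardless of the attempt id.
    pending.discard(partition)

on_task_success(0, attempt=0)   # straggler from the old attempt finishes
on_task_success(1, attempt=1)
on_task_success(2, attempt=1)

# Once nothing is pending, the downstream stage can be scheduled, even if
# tasks from the active attempt are still running.
assert not pending
```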
[jira] [Commented] (SPARK-19755) Blacklist is always active for MesosCoarseGrainedSchedulerBackend. As result - scheduler cannot create an executor after some time.
[ https://issues.apache.org/jira/browse/SPARK-19755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930909#comment-15930909 ] Kay Ousterhout commented on SPARK-19755: I'm closing this because the configs you're proposing adding already exist: spark.blacklist.enabled already exists to turn off all blacklisting (this is false by default, so the fact that you're seeing blacklisting behavior means that your configuration enables blacklisting), and spark.blacklist.maxFailedTaskPerExecutor is the other thing you proposed adding. All of the blacklisting parameters are listed here: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/config/package.scala#L101 Feel free to re-open this if I've misunderstood and the existing configs don't address the issues you're seeing! > Blacklist is always active for MesosCoarseGrainedSchedulerBackend. As result > - scheduler cannot create an executor after some time. > --- > > Key: SPARK-19755 > URL: https://issues.apache.org/jira/browse/SPARK-19755 > Project: Spark > Issue Type: Bug > Components: Mesos, Scheduler >Affects Versions: 2.1.0 > Environment: mesos, marathon, docker - driver and executors are > dockerized. >Reporter: Timur Abakumov > > When for some reason task fails - MesosCoarseGrainedSchedulerBackend > increased failure counter for a slave where that task was running. > When counter is >=2 (MAX_SLAVE_FAILURES) mesos slave is excluded. > Over time scheduler cannot create a new executor - every slave is is in the > blacklist. Task failure not necessary related to host health- especially for > long running stream apps. > If accepted as a bug: possible solution is to use: spark.blacklist.enabled to > make that functionality optional and if it make sense MAX_SLAVE_FAILURES > also can be configurable. 
[jira] [Resolved] (SPARK-19755) Blacklist is always active for MesosCoarseGrainedSchedulerBackend. As result - scheduler cannot create an executor after some time.
[ https://issues.apache.org/jira/browse/SPARK-19755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout resolved SPARK-19755. Resolution: Not A Problem > Blacklist is always active for MesosCoarseGrainedSchedulerBackend. As result > - scheduler cannot create an executor after some time. > --- > > Key: SPARK-19755 > URL: https://issues.apache.org/jira/browse/SPARK-19755 > Project: Spark > Issue Type: Bug > Components: Mesos, Scheduler >Affects Versions: 2.1.0 > Environment: mesos, marathon, docker - driver and executors are > dockerized. >Reporter: Timur Abakumov > > When for some reason task fails - MesosCoarseGrainedSchedulerBackend > increased failure counter for a slave where that task was running. > When counter is >=2 (MAX_SLAVE_FAILURES) mesos slave is excluded. > Over time scheduler cannot create a new executor - every slave is is in the > blacklist. Task failure not necessary related to host health- especially for > long running stream apps. > If accepted as a bug: possible solution is to use: spark.blacklist.enabled to > make that functionality optional and if it make sense MAX_SLAVE_FAILURES > also can be configurable. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-19873) If the user changes the number of shuffle partitions between batches, Streaming aggregation will fail.
[ https://issues.apache.org/jira/browse/SPARK-19873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-19873. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17216 [https://github.com/apache/spark/pull/17216] > If the user changes the number of shuffle partitions between batches, > Streaming aggregation will fail. > -- > > Key: SPARK-19873 > URL: https://issues.apache.org/jira/browse/SPARK-19873 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.1.0 >Reporter: Kunal Khamar > Fix For: 2.2.0 > > > If the user changes the shuffle partition number between batches, Streaming > aggregation will fail. > Here are some possible cases: > - Change "spark.sql.shuffle.partitions" > - Use "repartition" and change the partition number in codes > - RangePartitioner doesn't generate deterministic partitions. Right now it's > safe as we disallow sort before aggregation. Not sure if we will add some > operators using RangePartitioner in future. > Fix: > Record # shuffle partitions in offset log and enforce in next batch -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
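The fix described at the end of the issue (record the shuffle partition count in the offset log and enforce it in the next batch) can be sketched as follows. This is a hypothetical model of the idea, not the Structured Streaming implementation:

```python
def shuffle_partitions_for_batch(offset_log, batch_id, session_conf):
    """Return the partition count to use, preferring what the offset log recorded."""
    recorded = offset_log.get(batch_id - 1, {}).get("shuffle_partitions")
    if recorded is not None:
        # Enforce the recorded value so state-store partitioning stays stable,
        # even if the user changed spark.sql.shuffle.partitions between batches.
        return recorded
    return session_conf["spark.sql.shuffle.partitions"]

offset_log = {0: {"shuffle_partitions": 200}}
conf = {"spark.sql.shuffle.partitions": 50}   # user changed it between batches

assert shuffle_partitions_for_batch(offset_log, 1, conf) == 200  # recorded value wins
assert shuffle_partitions_for_batch(offset_log, 0, conf) == 50   # first batch uses conf
```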
[jira] [Resolved] (SPARK-19967) Add from_json APIs to SQL
[ https://issues.apache.org/jira/browse/SPARK-19967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-19967. - Resolution: Fixed Fix Version/s: 2.2.0 > Add from_json APIs to SQL > - > > Key: SPARK-19967 > URL: https://issues.apache.org/jira/browse/SPARK-19967 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 2.1.0 >Reporter: Xiao Li >Assignee: Takeshi Yamamuro > Fix For: 2.2.0 > > > The method "from_json" is a useful method in turning a string column into a > nested StructType with a user specified schema. The schema should be > specified in the DDL format -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-19967) Add from_json APIs to SQL
[ https://issues.apache.org/jira/browse/SPARK-19967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-19967: --- Assignee: Takeshi Yamamuro > Add from_json APIs to SQL > - > > Key: SPARK-19967 > URL: https://issues.apache.org/jira/browse/SPARK-19967 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 2.1.0 >Reporter: Xiao Li >Assignee: Takeshi Yamamuro > Fix For: 2.2.0 > > > The method "from_json" is a useful method in turning a string column into a > nested StructType with a user specified schema. The schema should be > specified in the DDL format -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-19967) Add from_json APIs to SQL
[ https://issues.apache.org/jira/browse/SPARK-19967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930796#comment-15930796 ] Xiao Li commented on SPARK-19967: - Resolved in the PR https://github.com/apache/spark/commit/7de66bae58733595cb88ec899640f7acf734d5c4 > Add from_json APIs to SQL > - > > Key: SPARK-19967 > URL: https://issues.apache.org/jira/browse/SPARK-19967 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 2.1.0 >Reporter: Xiao Li > Fix For: 2.2.0 > > > The method "from_json" is a useful method in turning a string column into a > nested StructType with a user specified schema. The schema should be > specified in the DDL format -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
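The semantics being added (turn a JSON string column into nested fields according to a user-supplied schema, yielding null on malformed input) can be approximated in plain Python. This illustrates the behavior only; it is not the Spark implementation, and the dict-based schema is a stand-in for the DDL string:

```python
import json

def from_json(col_value, schema):
    """schema: {field_name: type}, loosely mirroring a DDL string like 'a INT, b STRING'."""
    try:
        parsed = json.loads(col_value)
        return {name: typ(parsed[name]) if name in parsed else None
                for name, typ in schema.items()}
    except (json.JSONDecodeError, TypeError, ValueError):
        return None  # Spark's from_json yields null for unparsable input

schema = {"a": int, "b": str}
assert from_json('{"a": 1, "b": "x"}', schema) == {"a": 1, "b": "x"}
assert from_json('{"a": 1}', schema) == {"a": 1, "b": None}  # missing field -> null
assert from_json("not json", schema) is None
```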
[jira] [Created] (SPARK-20007) Make SparkR apply() functions robust to workers that return empty data.frame
Hossein Falaki created SPARK-20007: -- Summary: Make SparkR apply() functions robust to workers that return empty data.frame Key: SPARK-20007 URL: https://issues.apache.org/jira/browse/SPARK-20007 Project: Spark Issue Type: Bug Components: SparkR Affects Versions: 2.2.0 Reporter: Hossein Falaki When using {{gapply()}} (or other members of the {{apply()}} family) with a schema, Spark will try to parse data returned from the R process on each worker as Spark DataFrame Rows based on the schema. In this case our provided schema suggests that we have six columns. When an R worker returns results to the JVM, SparkSQL will try to access its columns one by one and cast them to proper types. If an R worker returns nothing, the JVM will throw an {{ArrayIndexOutOfBoundsException}}. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
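A minimal model of the failure mode: the receiving side indexes the columns of whatever rows the worker returned, so a worker that returns zero rows blows up unless the empty case is handled first. This is a Python sketch of the idea (an `IndexError` here is the analogue of the `ArrayIndexOutOfBoundsException`); the names are hypothetical, not SparkR internals:

```python
SCHEMA = ["c1", "c2", "c3", "c4", "c5", "c6"]  # six columns, as in the description

def rows_to_typed(rows):
    # Robust version: an empty result from a worker simply contributes zero
    # output rows, instead of being indexed column-by-column and failing.
    if not rows:
        return []
    return [{col: row[i] for i, col in enumerate(SCHEMA)} for row in rows]

assert rows_to_typed([]) == []                           # empty data.frame from a worker
assert rows_to_typed([(1, 2, 3, 4, 5, 6)])[0]["c6"] == 6
```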
[jira] [Resolved] (SPARK-18847) PageRank gives incorrect results for graphs with sinks
[ https://issues.apache.org/jira/browse/SPARK-18847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18847. - Resolution: Fixed Assignee: Andrew Ray Fix Version/s: 2.2.0 > PageRank gives incorrect results for graphs with sinks > -- > > Key: SPARK-18847 > URL: https://issues.apache.org/jira/browse/SPARK-18847 > Project: Spark > Issue Type: Bug > Components: GraphX >Affects Versions: 1.0.2, 1.1.1, 1.2.2, 1.3.1, 1.4.1, 1.5.2, 1.6.3, 2.0.2 >Reporter: Andrew Ray >Assignee: Andrew Ray > Fix For: 2.2.0 > > > Sink vertices (those with no outgoing edges) should evenly distribute their > rank to the entire graph but in the current implementation it is just lost. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
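The fix described in the issue (a sink vertex's rank is split evenly across the whole graph instead of being lost) can be sketched with a small dense PageRank. This is pure Python for illustration, not the GraphX implementation:

```python
def pagerank(n, edges, iters=50, d=0.85):
    """edges: list of (src, dst). Sink vertices distribute their rank to everyone."""
    out = [[] for _ in range(n)]
    for s, t in edges:
        out[s].append(t)
    rank = [1.0 / n] * n
    for _ in range(iters):
        # Rank held by vertices with no outgoing edges: redistribute it evenly
        # rather than dropping it, so total rank is conserved.
        sink_mass = sum(rank[v] for v in range(n) if not out[v])
        new = [(1 - d) / n + d * sink_mass / n] * n
        for v in range(n):
            for t in out[v]:
                new[t] += d * rank[v] / len(out[v])
        rank = new
    return rank

# Vertex 2 is a sink; with redistribution the ranks still sum to 1.
r = pagerank(3, [(0, 1), (1, 2)])
assert abs(sum(r) - 1.0) < 1e-9
```

Without the `sink_mass` term, each iteration would leak the sink's rank and the vector would no longer sum to 1, which is the incorrect behavior being reported.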
[jira] [Closed] (SPARK-16878) YARN - topology information
[ https://issues.apache.org/jira/browse/SPARK-16878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shubham Chopra closed SPARK-16878. -- Resolution: Unresolved Withdrawing issue. As a stop-gap, a file based topology mapper can be used where needed. > YARN - topology information > > > Key: SPARK-16878 > URL: https://issues.apache.org/jira/browse/SPARK-16878 > Project: Spark > Issue Type: Sub-task > Components: YARN >Reporter: Shubham Chopra > > Block replication strategies need topology information for ideal block > placements. This information is available in resource managers and/or can be > provided separately through scripts/services/classes, as in the case of HDFS. > This jira focuses on enhancing spark-yarn package to suitably extract > topology information from YARN. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-20006) Separate threshold for broadcast and shuffled hash join
[ https://issues.apache.org/jira/browse/SPARK-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang updated SPARK-20006: --- Description: Currently both canBroadcast and canBuildLocalHashMap use the same configuration: AUTO_BROADCASTJOIN_THRESHOLD. But the memory model may be different. For broadcast, currently the hash map is always build on heap. For shuffledHashJoin, the hash map may be build on heap(longHash), or off heap(other map if off heap is enabled). The same configuration makes the configuration hard to tune (how to allocate memory onheap/offheap). Propose to use different configuration. Please comments whether it is reasonable. was: Currently both canBroadcast and canBuildLocalHashMap use the same configuration: AUTO_BROADCASTJOIN_THRESHOLD. But the memory model may be different. For broadcast, currently the hash map is always build on heap. For shuffledHashJoin, the hash map may be build on heap(longHash), or off heap(other map if off heap is enabled). The same configuration makes the configuration hard to tune (how to allocate memory onheap/offheap). Propose to use different configuration. > Separate threshold for broadcast and shuffled hash join > --- > > Key: SPARK-20006 > URL: https://issues.apache.org/jira/browse/SPARK-20006 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >Reporter: Zhan Zhang >Priority: Minor > > Currently both canBroadcast and canBuildLocalHashMap use the same > configuration: AUTO_BROADCASTJOIN_THRESHOLD. > But the memory model may be different. For broadcast, currently the hash map > is always build on heap. For shuffledHashJoin, the hash map may be build on > heap(longHash), or off heap(other map if off heap is enabled). The same > configuration makes the configuration hard to tune (how to allocate memory > onheap/offheap). Propose to use different configuration. Please comments > whether it is reasonable. 
[jira] [Created] (SPARK-20006) Separate threshold for broadcast and shuffled hash join
Zhan Zhang created SPARK-20006: -- Summary: Separate threshold for broadcast and shuffled hash join Key: SPARK-20006 URL: https://issues.apache.org/jira/browse/SPARK-20006 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 2.1.0 Reporter: Zhan Zhang Priority: Minor Currently both canBroadcast and canBuildLocalHashMap use the same configuration: AUTO_BROADCASTJOIN_THRESHOLD. But the memory model may be different. For broadcast, currently the hash map is always build on heap. For shuffledHashJoin, the hash map may be build on heap(longHash), or off heap(other map if off heap is enabled). The same configuration makes the configuration hard to tune (how to allocate memory onheap/offheap). Propose to use different configuration. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
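The proposal above (one threshold for broadcast, a separate one for building a local hash map in a shuffled hash join) can be sketched as a strategy chooser. The function and parameter names are hypothetical; the real decision lives in Spark's join planning via canBroadcast and canBuildLocalHashMap:

```python
def choose_join(build_side_bytes, broadcast_threshold, local_hashmap_threshold):
    # Today both checks read the single AUTO_BROADCASTJOIN_THRESHOLD, even
    # though broadcast maps are built on-heap while shuffled-hash maps may be
    # on- or off-heap; splitting the thresholds decouples the two decisions.
    if build_side_bytes <= broadcast_threshold:
        return "BroadcastHashJoin"
    if build_side_bytes <= local_hashmap_threshold:
        return "ShuffledHashJoin"
    return "SortMergeJoin"

MB = 1 << 20
assert choose_join(5 * MB, broadcast_threshold=10 * MB, local_hashmap_threshold=100 * MB) == "BroadcastHashJoin"
assert choose_join(50 * MB, 10 * MB, 100 * MB) == "ShuffledHashJoin"
assert choose_join(500 * MB, 10 * MB, 100 * MB) == "SortMergeJoin"
```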
[jira] [Updated] (SPARK-20005) There is no "Newline" in UI in description
[ https://issues.apache.org/jira/browse/SPARK-20005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Egor Pahomov updated SPARK-20005: - Description: There is no "newline" in UI in description: https://ibb.co/bLp2yv (was: Just see the attachment) > There is no "Newline" in UI in description > --- > > Key: SPARK-20005 > URL: https://issues.apache.org/jira/browse/SPARK-20005 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 2.1.0 >Reporter: Egor Pahomov >Priority: Minor > > There is no "newline" in UI in description: https://ibb.co/bLp2yv -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-20005) There is no "Newline" in UI in description
Egor Pahomov created SPARK-20005: Summary: There is no "Newline" in UI in description Key: SPARK-20005 URL: https://issues.apache.org/jira/browse/SPARK-20005 Project: Spark Issue Type: Bug Components: Web UI Affects Versions: 2.1.0 Reporter: Egor Pahomov Priority: Minor Just see the attachment -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-20004) Spark thrift server ovewrites spark.app.name
[ https://issues.apache.org/jira/browse/SPARK-20004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Egor Pahomov updated SPARK-20004: - Summary: Spark thrift server ovewrites spark.app.name (was: Spark thrift server ovverides spark.app.name) > Spark thrift server ovewrites spark.app.name > > > Key: SPARK-20004 > URL: https://issues.apache.org/jira/browse/SPARK-20004 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >Reporter: Egor Pahomov >Priority: Minor > > {code} > export SPARK_YARN_APP_NAME="ODBC server $host" > /spark/sbin/start-thriftserver.sh --conf spark.yarn.queue=spark.client.$host > --conf spark.app.name="ODBC server $host" > {code} > And spark-defaults.conf contains: > {code} > spark.app.name "ODBC server spark01" > {code} > Still name in yarn is "Thrift JDBC/ODBC Server" -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-20004) Spark thrift server ovverides spark.app.name
Egor Pahomov created SPARK-20004: Summary: Spark thrift server ovverides spark.app.name Key: SPARK-20004 URL: https://issues.apache.org/jira/browse/SPARK-20004 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 2.1.0 Reporter: Egor Pahomov Priority: Minor {code} export SPARK_YARN_APP_NAME="ODBC server $host" /spark/sbin/start-thriftserver.sh --conf spark.yarn.queue=spark.client.$host --conf spark.app.name="ODBC server $host" {code} And spark-defaults.conf contains: {code} spark.app.name "ODBC server spark01" {code} Still name in yarn is "Thrift JDBC/ODBC Server" -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
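The report above boils down to configuration precedence: the thrift server sets its own app name after the user-supplied values are applied, so the hardcoded name wins over both spark-defaults.conf and --conf. A toy precedence model (illustrative only, not the actual SparkSubmit/HiveThriftServer2 code path):

```python
def effective_conf(defaults, cli_conf, hardcoded):
    conf = dict(defaults)       # spark-defaults.conf
    conf.update(cli_conf)       # --conf flags
    conf.update(hardcoded)      # values the thrift server sets itself, last
    return conf

conf = effective_conf(
    defaults={"spark.app.name": "ODBC server spark01"},
    cli_conf={"spark.app.name": "ODBC server host"},
    hardcoded={"spark.app.name": "Thrift JDBC/ODBC Server"},
)
# The hardcoded value wins, which is exactly the reported behavior.
assert conf["spark.app.name"] == "Thrift JDBC/ODBC Server"
```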
[jira] [Updated] (SPARK-20003) FPGrowthModel setMinConfidence should affect rules generation and transform
[ https://issues.apache.org/jira/browse/SPARK-20003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-20003: --- Description: I was doing some testing and found this issue. FPGrowthModel setMinConfidence should affect rules generation and transform. Currently associationRules in FPGrowthModel is a lazy val and setMinConfidence in FPGrowthModel has no impact once associationRules got computed. was: FPGrowthModel setMinConfidence should affect rules generation and transform. Currently associationRules in FPGrowthModel is a lazy val and setMinConfidence in FPGrowthModel has no impact once associationRules got computed . > FPGrowthModel setMinConfidence should affect rules generation and transform > --- > > Key: SPARK-20003 > URL: https://issues.apache.org/jira/browse/SPARK-20003 > Project: Spark > Issue Type: Bug > Components: ML >Affects Versions: 2.2.0 >Reporter: yuhao yang >Priority: Minor > > I was doing some testing and found this issue. FPGrowthModel setMinConfidence > should affect rules generation and transform. > Currently associationRules in FPGrowthModel is a lazy val and > setMinConfidence in FPGrowthModel has no impact once associationRules got > computed. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-20003) FPGrowthModel setMinConfidence should affect rules generation and transform
yuhao yang created SPARK-20003: -- Summary: FPGrowthModel setMinConfidence should affect rules generation and transform Key: SPARK-20003 URL: https://issues.apache.org/jira/browse/SPARK-20003 Project: Spark Issue Type: Bug Components: ML Affects Versions: 2.2.0 Reporter: yuhao yang Priority: Minor FPGrowthModel setMinConfidence should affect rules generation and transform. Currently associationRules in FPGrowthModel is a lazy val and setMinConfidence in FPGrowthModel has no impact once associationRules got computed . -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
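The root cause described above — associationRules is a lazy val, so a later setMinConfidence has no effect once it has been computed — has a close Python analogue in functools.cached_property. This is an illustration of the caching pitfall, not the Spark ML code:

```python
from functools import cached_property

class FPGrowthModelSketch:
    """Hypothetical stand-in for a model with a lazily cached derived value."""

    def __init__(self, min_confidence):
        self.min_confidence = min_confidence

    @cached_property
    def association_rules(self):
        # Computed once on first access, like a Scala lazy val.
        return f"rules@minConfidence={self.min_confidence}"

m = FPGrowthModelSketch(min_confidence=0.8)
first = m.association_rules      # forces computation
m.min_confidence = 0.5           # the later "setter"...
assert m.association_rules == first  # ...has no effect: the cached value wins
```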
[jira] [Commented] (SPARK-19941) Spark should not schedule tasks on executors on decommissioning YARN nodes
[ https://issues.apache.org/jira/browse/SPARK-19941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930433#comment-15930433 ] Karthik Palaniappan commented on SPARK-19941: - Yeah, I could have been more clear. The application *should* continue, but the driver should drain executors *on decommissioning nodes* similar to how YARN is draining the NMs. All other executors should continue running. > Spark should not schedule tasks on executors on decommissioning YARN nodes > -- > > Key: SPARK-19941 > URL: https://issues.apache.org/jira/browse/SPARK-19941 > Project: Spark > Issue Type: Improvement > Components: Scheduler, YARN >Affects Versions: 2.1.0 > Environment: Hadoop 2.8.0-rc1 >Reporter: Karthik Palaniappan > > Hadoop 2.8 added a mechanism to gracefully decommission Node Managers in > YARN: https://issues.apache.org/jira/browse/YARN-914 > Essentially you can mark nodes to be decommissioned, and let them a) finish > work in progress and b) finish serving shuffle data. But no new work will be > scheduled on the node. > Spark should respect when NMs are set to decommissioned, and similarly > decommission executors on those nodes by not scheduling any more tasks on > them. > It looks like in the future YARN may inform the app master when containers > will be killed: https://issues.apache.org/jira/browse/YARN-3784. However, I > don't think Spark should schedule based on a timeout. We should gracefully > decommission the executor as fast as possible (which is the spirit of > YARN-914). The app master can query the RM for NM statuses (if it doesn't > already have them) and stop scheduling on executors on NMs that are > decommissioning. > Stretch feature: The timeout may be useful in determining whether running > further tasks on the executor is even helpful. Spark may be able to tell that > shuffle data will not be consumed by the time the node is decommissioned, so > it is not worth computing. The executor can be killed immediately. 
[jira] [Comment Edited] (SPARK-19941) Spark should not schedule tasks on executors on decommissioning YARN nodes
[ https://issues.apache.org/jira/browse/SPARK-19941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930433#comment-15930433 ] Karthik Palaniappan edited comment on SPARK-19941 at 3/17/17 6:28 PM: -- Yeah, I could have been more clear. The application *should* continue, but the driver should drain executors *on decommissioning nodes* similar to how YARN is draining the NMs. All other executors can continue to have tasks scheduled on them. was (Author: karthik palaniappan): Yeah, I could have been more clear. The application *should* continue, but the driver should drain executors *on decommissioning nodes* similar to how YARN is draining the NMs. All other executors should continue running. > Spark should not schedule tasks on executors on decommissioning YARN nodes > -- > > Key: SPARK-19941 > URL: https://issues.apache.org/jira/browse/SPARK-19941 > Project: Spark > Issue Type: Improvement > Components: Scheduler, YARN >Affects Versions: 2.1.0 > Environment: Hadoop 2.8.0-rc1 >Reporter: Karthik Palaniappan > > Hadoop 2.8 added a mechanism to gracefully decommission Node Managers in > YARN: https://issues.apache.org/jira/browse/YARN-914 > Essentially you can mark nodes to be decommissioned, and let them a) finish > work in progress and b) finish serving shuffle data. But no new work will be > scheduled on the node. > Spark should respect when NMs are set to decommissioned, and similarly > decommission executors on those nodes by not scheduling any more tasks on > them. > It looks like in the future YARN may inform the app master when containers > will be killed: https://issues.apache.org/jira/browse/YARN-3784. However, I > don't think Spark should schedule based on a timeout. We should gracefully > decommission the executor as fast as possible (which is the spirit of > YARN-914). The app master can query the RM for NM statuses (if it doesn't > already have them) and stop scheduling on executors on NMs that are > decommissioning. 
> Stretch feature: The timeout may be useful in determining whether running > further tasks on the executor is even helpful. Spark may be able to tell that > shuffle data will not be consumed by the time the node is decommissioned, so > it is not worth computing. The executor can be killed immediately. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
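The scheduling change requested above amounts to filtering executors on decommissioning nodes out of the set the driver will offer tasks to, while letting their in-flight work finish. A toy sketch (hypothetical names; not Spark's YARN scheduler backend):

```python
def schedulable_executors(executors, node_states):
    """Keep executors whose node is not DECOMMISSIONING; running tasks finish."""
    return [e for e in executors
            if node_states.get(e["node"]) != "DECOMMISSIONING"]

executors = [{"id": 1, "node": "nm-a"}, {"id": 2, "node": "nm-b"}]
states = {"nm-a": "RUNNING", "nm-b": "DECOMMISSIONING"}  # from RM node reports

# Only the executor on the healthy node receives new tasks.
assert [e["id"] for e in schedulable_executors(executors, states)] == [1]
```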
[jira] [Updated] (SPARK-20001) Support PythonRunner executing inside a Conda env
[ https://issues.apache.org/jira/browse/SPARK-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash updated SPARK-20001: --- Description: Similar to SPARK-13587, I'm trying to allow the user to configure a Conda environment that PythonRunner will run from. This change remembers theconda environment found on the driver and installs the same packages on the executor side, only once per PythonWorkerFactory. The list of requested conda packages are added to the PythonWorkerFactory cache, so two collects using the same environment (incl packages) can re-use the same running executors. You have to specify outright what packages and channels to "bootstrap" the environment with. However, SparkContext (as well as JavaSparkContext & the pyspark version) are expanded to support addCondaPackage and addCondaChannel. Rationale is: * you might want to add more packages once you're already running in the driver * you might want to add a channel which requires some token for authentication, which you don't yet have access to until the module is already running This issue requires that the conda binary is already available on the driver as well as executors, you just have to specify where it can be found. Please see the attached pull request on palantir/spark for additional details: https://github.com/palantir/spark/pull/115 As for tests, there is a local python test, as well as yarn client & cluster-mode tests, which ensure that a newly installed package is visible from both the driver and the executor. was: Similar to SPARK-13587, I'm trying to allow the user to configure a Conda environment that PythonRunner will run from. This change remembers theconda environment found on the driver and installs the same packages on the executor side, only once per PythonWorkerFactory. The list of requested conda packages are added to the PythonWorkerFactory cache, so two collects using the same environment (incl packages) can re-use the same running executors. 
You have to specify outright what packages and channels to "bootstrap" the environment with. However, SparkContext (as well as JavaSparkContext & the pyspark version) are expanded to support addCondaPackage and addCondaChannel. Rationale is: * you might want to add more packages once you're already running in the driver * you might want to add a channel which requires some token for authentication, which you don't yet have access to until the module is already running This issue requires that the conda binary is already available on the driver as well as executors, you just have to specify where it can be found. Please see the attached issue on palantir/spark for additional details: https://github.com/palantir/spark/pull/115 As for tests, there is a local python test, as well as yarn client & cluster-mode tests, which ensure that a newly installed package is visible from both the driver and the executor. > Support PythonRunner executing inside a Conda env > - > > Key: SPARK-20001 > URL: https://issues.apache.org/jira/browse/SPARK-20001 > Project: Spark > Issue Type: New Feature > Components: PySpark, Spark Core >Affects Versions: 2.2.0 >Reporter: Dan Sanduleac > Original Estimate: 168h > Remaining Estimate: 168h > > Similar to SPARK-13587, I'm trying to allow the user to configure a Conda > environment that PythonRunner will run from. > This change remembers the conda environment found on the driver and installs > the same packages on the executor side, only once per PythonWorkerFactory. > The list of requested conda packages is added to the PythonWorkerFactory > cache, so two collects using the same environment (incl. packages) can re-use > the same running executors. > You have to specify outright what packages and channels to "bootstrap" the > environment with. > However, SparkContext (as well as JavaSparkContext & the pyspark version) are > expanded to support addCondaPackage and addCondaChannel. 
> Rationale is: > * you might want to add more packages once you're already running in the > driver > * you might want to add a channel which requires some token for > authentication, which you don't yet have access to until the module is > already running > This issue requires that the conda binary is already available on the driver > as well as executors, you just have to specify where it can be found. > Please see the attached pull request on palantir/spark for additional > details: https://github.com/palantir/spark/pull/115 > As for tests, there is a local python test, as well as yarn client & > cluster-mode tests, which ensure that a newly installed package is visible > from both the driver and the executor.
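The worker-reuse behaviour described above (one factory per conda environment spec, shared across collects that request the same channels and packages) can be sketched in plain Python. Note that `WorkerFactory`, `get_worker_factory`, and the cache itself are hypothetical stand-ins for illustration, not the actual PySpark internals:

```python
# Hypothetical sketch of a PythonWorkerFactory cache keyed by the
# requested conda channels and packages (names are illustrative only).

class WorkerFactory:
    """Stands in for a per-environment pool of Python worker processes."""
    def __init__(self, channels, packages):
        self.channels = tuple(channels)
        self.packages = tuple(sorted(packages))

_factories = {}

def get_worker_factory(channels, packages):
    # Two collects requesting the same environment (incl. packages)
    # hit the same cache entry, so they re-use the same workers.
    key = (tuple(channels), tuple(sorted(packages)))
    if key not in _factories:
        _factories[key] = WorkerFactory(channels, packages)
    return _factories[key]
```

Sorting the package list makes the cache key order-insensitive, so requesting the same packages in a different order still reuses the running executors.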
[jira] [Updated] (SPARK-20001) Support PythonRunner executing inside a Conda env
[ https://issues.apache.org/jira/browse/SPARK-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash updated SPARK-20001: --- Description: Similar to SPARK-13587, I'm trying to allow the user to configure a Conda environment that PythonRunner will run from. This change remembers the conda environment found on the driver and installs the same packages on the executor side, only once per PythonWorkerFactory. The list of requested conda packages is added to the PythonWorkerFactory cache, so two collects using the same environment (incl. packages) can re-use the same running executors. You have to specify outright what packages and channels to "bootstrap" the environment with. However, SparkContext (as well as JavaSparkContext & the pyspark version) are expanded to support addCondaPackage and addCondaChannel. Rationale is: * you might want to add more packages once you're already running in the driver * you might want to add a channel which requires some token for authentication, which you don't yet have access to until the module is already running This issue requires that the conda binary is already available on the driver as well as executors, you just have to specify where it can be found. Please see the attached pull request on palantir/spark for additional details: https://github.com/palantir/spark/pull/115 As for tests, there is a local python test, as well as yarn client & cluster-mode tests, which ensure that a newly installed package is visible from both the driver and the executor. was: Similar to SPARK-13587, I'm trying to allow the user to configure a Conda environment that PythonRunner will run from. This change remembers the conda environment found on the driver and installs the same packages on the executor side, only once per PythonWorkerFactory. The list of requested conda packages is added to the PythonWorkerFactory cache, so two collects using the same environment (incl. packages) can re-use the same running executors. 
You have to specify outright what packages and channels to "bootstrap" the environment with. However, SparkContext (as well as JavaSparkContext & the pyspark version) are expanded to support addCondaPackage and addCondaChannel. Rationale is: * you might want to add more packages once you're already running in the driver * you might want to add a channel which requires some token for authentication, which you don't yet have access to until the module is already running This issue requires that the conda binary is already available on the driver as well as executors, you just have to specify where it can be found. Please see the attached issue on palantir/spark for additional details. As for tests, there is a local python test, as well as yarn client & cluster-mode tests, which ensure that a newly installed package is visible from both the driver and the executor. > Support PythonRunner executing inside a Conda env > - > > Key: SPARK-20001 > URL: https://issues.apache.org/jira/browse/SPARK-20001 > Project: Spark > Issue Type: New Feature > Components: PySpark, Spark Core >Affects Versions: 2.2.0 >Reporter: Dan Sanduleac > Original Estimate: 168h > Remaining Estimate: 168h > > Similar to SPARK-13587, I'm trying to allow the user to configure a Conda > environment that PythonRunner will run from. > This change remembers the conda environment found on the driver and installs > the same packages on the executor side, only once per PythonWorkerFactory. > The list of requested conda packages is added to the PythonWorkerFactory > cache, so two collects using the same environment (incl. packages) can re-use > the same running executors. > You have to specify outright what packages and channels to "bootstrap" the > environment with. > However, SparkContext (as well as JavaSparkContext & the pyspark version) are > expanded to support addCondaPackage and addCondaChannel. 
> Rationale is: > * you might want to add more packages once you're already running in the > driver > * you might want to add a channel which requires some token for > authentication, which you don't yet have access to until the module is > already running > This issue requires that the conda binary is already available on the driver > as well as executors, you just have to specify where it can be found. > Please see the attached pull request on palantir/spark for additional details: > https://github.com/palantir/spark/pull/115 > As for tests, there is a local python test, as well as yarn client & > cluster-mode tests, which ensure that a newly installed package is visible > from both the driver and the executor.
[jira] [Updated] (SPARK-18971) Netty issue may cause the shuffle client hang
[ https://issues.apache.org/jira/browse/SPARK-18971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-18971: - Description: Check https://github.com/netty/netty/issues/6153 for details You should be able to see the following similar stack trace in the executor thread dump. {code} "shuffle-client-7-4" daemon prio=5 tid=97 RUNNABLE at io.netty.util.Recycler$Stack.scavengeSome(Recycler.java:504) at io.netty.util.Recycler$Stack.scavenge(Recycler.java:454) at io.netty.util.Recycler$Stack.pop(Recycler.java:435) at io.netty.util.Recycler.get(Recycler.java:144) at io.netty.buffer.PooledUnsafeDirectByteBuf.newInstance(PooledUnsafeDirectByteBuf.java:39) at io.netty.buffer.PoolArena$DirectArena.newByteBuf(PoolArena.java:727) at io.netty.buffer.PoolArena.allocate(PoolArena.java:140) at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:271) at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:177) at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:168) at io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:129) at io.netty.channel.AdaptiveRecvByteBufAllocator$HandleImpl.allocate(AdaptiveRecvByteBufAllocator.java:104) at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:117) at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:652) at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:575) at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:489) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:451) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140) at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144) at java.lang.Thread.run(Thread.java:745) {code} was:Check 
https://github.com/netty/netty/issues/6153 for details > Netty issue may cause the shuffle client hang > - > > Key: SPARK-18971 > URL: https://issues.apache.org/jira/browse/SPARK-18971 > Project: Spark > Issue Type: Bug > Components: Spark Core >Reporter: Shixiong Zhu >Assignee: Shixiong Zhu >Priority: Minor > Fix For: 2.2.0 > > > Check https://github.com/netty/netty/issues/6153 for details > You should be able to see the following similar stack trace in the executor > thread dump. > {code} > "shuffle-client-7-4" daemon prio=5 tid=97 RUNNABLE > at io.netty.util.Recycler$Stack.scavengeSome(Recycler.java:504) > at io.netty.util.Recycler$Stack.scavenge(Recycler.java:454) > at io.netty.util.Recycler$Stack.pop(Recycler.java:435) > at io.netty.util.Recycler.get(Recycler.java:144) > at > io.netty.buffer.PooledUnsafeDirectByteBuf.newInstance(PooledUnsafeDirectByteBuf.java:39) > at > io.netty.buffer.PoolArena$DirectArena.newByteBuf(PoolArena.java:727) > at io.netty.buffer.PoolArena.allocate(PoolArena.java:140) > at > io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:271) > at > io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:177) > at > io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:168) > at > io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:129) > at > io.netty.channel.AdaptiveRecvByteBufAllocator$HandleImpl.allocate(AdaptiveRecvByteBufAllocator.java:104) > at > io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:117) > at > io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:652) > at > io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:575) > at > io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:489) > at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:451) > at > 
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140) > at > io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144) > at java.lang.Thread.run(Thread.java:745) > {code}
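Since the hang shows up as a RUNNABLE shuffle-client thread parked in `Recycler$Stack.scavengeSome`, a captured thread dump can be checked for that telltale frame. The helper below is a hypothetical diagnostic sketch, not part of Spark or Netty:

```python
# Hypothetical helper: flag threads in a jstack-style dump whose frames
# include the Netty Recycler scavenge loop (see netty/netty#6153).

def find_stuck_recycler_threads(dump_text):
    stuck = []
    current = None
    for line in dump_text.splitlines():
        line = line.strip()
        if line.startswith('"'):
            # Thread header line, e.g. "shuffle-client-7-4" daemon prio=5 ...
            current = line.split('"')[1]
        elif "Recycler$Stack.scavengeSome" in line and current:
            stuck.append(current)
    return stuck
```

Running it over a dump saved from a hung executor would list the shuffle-client threads spinning in the Recycler, matching the stack trace quoted above.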
[jira] [Resolved] (SPARK-19986) Make pyspark.streaming.tests.CheckpointTests more stable
[ https://issues.apache.org/jira/browse/SPARK-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-19986. --- Resolution: Fixed Fix Version/s: 2.2.0 2.0.3 2.1.1 Issue resolved by pull request 17323 [https://github.com/apache/spark/pull/17323] > Make pyspark.streaming.tests.CheckpointTests more stable > > > Key: SPARK-19986 > URL: https://issues.apache.org/jira/browse/SPARK-19986 > Project: Spark > Issue Type: Improvement > Components: Tests >Affects Versions: 2.1.0 >Reporter: Shixiong Zhu > Fix For: 2.1.1, 2.0.3, 2.2.0 > > > Sometimes, CheckpointTests will hang because the streaming jobs are too slow > and cannot catch up.
[jira] [Created] (SPARK-20002) Add support for unions between streaming and batch datasets
Leon Pham created SPARK-20002: - Summary: Add support for unions between streaming and batch datasets Key: SPARK-20002 URL: https://issues.apache.org/jira/browse/SPARK-20002 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 2.0.2 Reporter: Leon Pham Currently unions between streaming datasets and batch datasets are not supported.
[jira] [Resolved] (SPARK-19721) Good error message for version mismatch in log files
[ https://issues.apache.org/jira/browse/SPARK-19721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-19721. -- Resolution: Fixed Fix Version/s: 2.1.1 > Good error message for version mismatch in log files > > > Key: SPARK-19721 > URL: https://issues.apache.org/jira/browse/SPARK-19721 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.1.0 >Reporter: Michael Armbrust >Assignee: Liwei Lin >Priority: Blocker > Fix For: 2.1.1, 2.2.0 > > > There are several places where we write out version identifiers in various > logs for structured streaming (usually {{v1}}). However, in the places where > we check for this, we throw a confusing error message. Instead, we should do > the following: > - Find all of the places where we do this kind of check. > - for {{vN}} where {{N>1}} say "UnsupportedLogFormat: The file {{path}} was > produced by a newer version of Spark and cannot be read by this version. > Please upgrade" > - for anything else throw an error saying the file is malformed.
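The behaviour SPARK-19721 asks for — accept {{v1}}, give a clear upgrade message for newer versions, and a malformed-file error for anything else — could look roughly like the sketch below. `validate_log_version` and its exact messages are illustrative, not the actual Structured Streaming code:

```python
# Illustrative version check for a streaming log file header,
# following the behaviour proposed in SPARK-19721.
import re

SUPPORTED_VERSION = 1  # this build reads log format v1


def validate_log_version(path, version_line):
    m = re.fullmatch(r"v(\d+)", version_line.strip())
    if m is None:
        # Not a vN identifier at all: the file is malformed.
        raise ValueError(f"Malformed log file {path}: bad version line {version_line!r}")
    n = int(m.group(1))
    if n > SUPPORTED_VERSION:
        # Written by a newer Spark: tell the user to upgrade, not that
        # the file is corrupt.
        raise ValueError(
            f"UnsupportedLogFormat: the file {path} was produced by a newer "
            f"version of Spark and cannot be read by this version. Please upgrade."
        )
    return n
```

The key point is distinguishing the two failure modes, so a version skew no longer surfaces as a confusing parse error.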
[jira] [Updated] (SPARK-20001) Support PythonRunner executing inside a Conda env
[ https://issues.apache.org/jira/browse/SPARK-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Sanduleac updated SPARK-20001: -- Description: Similar to SPARK-13587, I'm trying to allow the user to configure a Conda environment that PythonRunner will run from. This change remembers the conda environment found on the driver and installs the same packages on the executor side, only once per PythonWorkerFactory. The list of requested conda packages is added to the PythonWorkerFactory cache, so two collects using the same environment (incl. packages) can re-use the same running executors. You have to specify outright what packages and channels to "bootstrap" the environment with. However, SparkContext (as well as JavaSparkContext & the pyspark version) are expanded to support addCondaPackage and addCondaChannel. Rationale is: * you might want to add more packages once you're already running in the driver * you might want to add a channel which requires some token for authentication, which you don't yet have access to until the module is already running This issue requires that the conda binary is already available on the driver as well as executors, you just have to specify where it can be found. Please see the attached issue on palantir/spark for additional details. As for tests, there is a local python test, as well as yarn client & cluster-mode tests, which ensure that a newly installed package is visible from both the driver and the executor. was: Similar to SPARK-13587, I'm trying to allow the user to configure a Conda environment that PythonRunner will run from. This change remembers the conda environment found on the driver and installs the same packages on the executor side, only once per PythonWorkerFactory. The list of requested conda packages is added to the PythonWorkerFactory cache, so two collects using the same environment (incl. packages) can re-use the same running executors. 
This issue requires that the conda binary is already available on the driver as well as executors, you just have to specify where it can be found. Please see the attached issue on palantir/spark for additional details. > Support PythonRunner executing inside a Conda env > - > > Key: SPARK-20001 > URL: https://issues.apache.org/jira/browse/SPARK-20001 > Project: Spark > Issue Type: New Feature > Components: PySpark, Spark Core >Affects Versions: 2.2.0 >Reporter: Dan Sanduleac > Original Estimate: 168h > Remaining Estimate: 168h > > Similar to SPARK-13587, I'm trying to allow the user to configure a Conda > environment that PythonRunner will run from. > This change remembers the conda environment found on the driver and installs > the same packages on the executor side, only once per PythonWorkerFactory. > The list of requested conda packages is added to the PythonWorkerFactory > cache, so two collects using the same environment (incl. packages) can re-use > the same running executors. > You have to specify outright what packages and channels to "bootstrap" the > environment with. > However, SparkContext (as well as JavaSparkContext & the pyspark version) are > expanded to support addCondaPackage and addCondaChannel. > Rationale is: > * you might want to add more packages once you're already running in the > driver > * you might want to add a channel which requires some token for > authentication, which you don't yet have access to until the module is > already running > This issue requires that the conda binary is already available on the driver > as well as executors, you just have to specify where it can be found. > Please see the attached issue on palantir/spark for additional details. > As for tests, there is a local python test, as well as yarn client & > cluster-mode tests, which ensure that a newly installed package is visible > from both the driver and the executor. 
[jira] [Created] (SPARK-20001) Support PythonRunner executing inside a Conda env
Dan Sanduleac created SPARK-20001: - Summary: Support PythonRunner executing inside a Conda env Key: SPARK-20001 URL: https://issues.apache.org/jira/browse/SPARK-20001 Project: Spark Issue Type: New Feature Components: PySpark, Spark Core Affects Versions: 2.2.0 Reporter: Dan Sanduleac Similar to SPARK-13587, I'm trying to allow the user to configure a Conda environment that PythonRunner will run from. This change remembers the conda environment found on the driver and installs the same packages on the executor side, only once per PythonWorkerFactory. The list of requested conda packages is added to the PythonWorkerFactory cache, so two collects using the same environment (incl. packages) can re-use the same running executors. This issue requires that the conda binary is already available on the driver as well as executors, you just have to specify where it can be found. Please see the attached issue on palantir/spark for additional details.
[jira] [Resolved] (SPARK-19997) proxy-user failed connecting to a kerberos configured metastore
[ https://issues.apache.org/jira/browse/SPARK-19997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-19997. Resolution: Duplicate > proxy-user failed connecting to a kerberos configured metastore > --- > > Key: SPARK-19997 > URL: https://issues.apache.org/jira/browse/SPARK-19997 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >Reporter: Kent Yao > > Start running spark-sql via proxy-user on a kerberos configured hadoop cluster > and metastore > {code} > bin/spark-sql --proxy-user hzyaoqin > {code} > Failed with the following err: > {code:java} > 17/03/17 16:05:41 INFO hive.metastore: Trying to connect to metastore with > URI thrift://xxx:9083 > 17/03/17 16:05:41 ERROR transport.TSaslTransport: SASL negotiation failure > javax.security.sasl.SaslException: GSS initiate failed [Caused by > GSSException: No valid credentials provided (Mechanism level: Failed to find > any Kerberos tgt)] > at > com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212) > at > org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) > at > org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) > at > org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) > at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) > at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) > at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:420) > at > 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:236) > at > org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104) > at > org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005) > at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024) > at > org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1234) > at > org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174) > at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:166) > at > org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503) > at > org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:192) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264) > at > 
org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:366) > at > org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:270) > at > org.apache.spark.sql.hive.HiveExternalCatalog.<init>(HiveExternalCatalog.scala:65) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at >
[jira] [Commented] (SPARK-19982) JavaDatasetSuite.testJavaBeanEncoder sometimes fails with "Unable to generate an encoder for inner class"
[ https://issues.apache.org/jira/browse/SPARK-19982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930094#comment-15930094 ] Jose Soltren commented on SPARK-19982: -- Pasting in the exception from the link in case the link dies: Unable to generate an encoder for inner class `test.org.apache.spark.sql.JavaDatasetSuite$SimpleJavaBean` without access to the scope that this class was defined in. Try moving this class out of its parent class.; org.apache.spark.sql.AnalysisException: Unable to generate an encoder for inner class `test.org.apache.spark.sql.JavaDatasetSuite$SimpleJavaBean` without access to the scope that this class was defined in. Try moving this class out of its parent class.; at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$$anonfun$2.applyOrElse(ExpressionEncoder.scala:264) at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$$anonfun$2.applyOrElse(ExpressionEncoder.scala:260) at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:243) at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:243) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:53) at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:242) at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:248) at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:248) at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:265) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at scala.collection.Iterator$class.foreach(Iterator.scala:727) at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48) at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103) at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47) at 
scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273) at scala.collection.AbstractIterator.to(Iterator.scala:1157) at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265) at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157) at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252) at scala.collection.AbstractIterator.toArray(Iterator.scala:1157) at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:305) at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:248) at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:233) at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.resolve(ExpressionEncoder.scala:260) at org.apache.spark.sql.Dataset.<init>(Dataset.scala:78) at org.apache.spark.sql.Dataset.<init>(Dataset.scala:89) at org.apache.spark.sql.SQLContext.createDataset(SQLContext.scala:507) at org.apache.spark.sql.SQLContext.createDataset(SQLContext.scala:520) at test.org.apache.spark.sql.JavaDatasetSuite.testJavaBeanEncoder(JavaDatasetSuite.java:696) > JavaDatasetSuite.testJavaBeanEncoder sometimes fails with "Unable to generate > an encoder for inner class" > - > > Key: SPARK-19982 > URL: https://issues.apache.org/jira/browse/SPARK-19982 > Project: Spark > Issue Type: Bug > Components: Tests >Affects Versions: 2.1.0 >Reporter: Jose Soltren > Labels: flaky-test > > JavaDatasetSuite.testJavaBeanEncoder fails sporadically with the error below: > Unable to generate an encoder for inner class > `test.org.apache.spark.sql.JavaDatasetSuite$SimpleJavaBean` without access to > the scope that this class was defined in. Try moving this class out of its > parent class. > From https://spark-tests.appspot.com/test-logs/35475788 > [~vanzin] looked into this back in October and reported: > I ran this test in a loop (both alone and with the rest of the spark-sql > tests) and never got a failure. I even used the same JDK as Jenkins > (1.7.0_51). 
> Also looked at the code and nothing seems wrong. The error is when an entry > with the parent class name is missing from the map kept in OuterScopes.scala, > but the test populates that map in its first line. So it doesn't look like a > race nor some issue with weak references (the map uses weak values). > public void testJavaBeanEncoder() { > OuterScopes.addOuterScope(this);
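The weak-reference failure mode ruled out above is easy to picture with a weak-valued map in any language: an entry vanishes as soon as the last strong reference to the outer instance is dropped. The sketch below uses Python's `weakref.WeakValueDictionary` purely as an analogy for the OuterScopes map; `Outer` and the key name are illustrative, not Spark code:

```python
# Analogy for a weak-valued scope registry: once the registered outer
# instance is no longer strongly referenced, its map entry disappears.
import gc
import weakref

class Outer:
    """Stands in for the enclosing test-suite instance held by the registry."""

scopes = weakref.WeakValueDictionary()

outer = Outer()
scopes["JavaDatasetSuite"] = outer
found_before = "JavaDatasetSuite" in scopes  # strong reference still alive

del outer       # drop the only strong reference
gc.collect()    # the weakly-held value is now collectable
found_after = "JavaDatasetSuite" in scopes
```

In the JIRA test, `this` stays strongly referenced for the duration of the test, which is why the weak values alone would not explain the missing entry.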
[jira] [Commented] (SPARK-13369) Number of consecutive fetch failures for a stage before the job is aborted should be configurable
[ https://issues.apache.org/jira/browse/SPARK-13369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930063#comment-15930063 ] Imran Rashid commented on SPARK-13369: -- Thanks for fixing this [~sitalke...@gmail.com]. I just noticed that earlier in this ticket, there was a discussion about the need to set this config for streaming. I don't believe that is true; the way this works, it actually should be fine for occasional fetch failures in a long-lived streaming job. The maximum number of fetch failures is per-stage, and the count is reset when the stage is run successfully. Can you explain why you'd need to modify this config for a streaming job? (The large cluster case at Facebook makes sense to me, as we discussed on the pr, and I updated the jira description accordingly.) > Number of consecutive fetch failures for a stage before the job is aborted > should be configurable > -- > > Key: SPARK-13369 > URL: https://issues.apache.org/jira/browse/SPARK-13369 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 1.6.0 >Reporter: Sital Kedia >Assignee: Sital Kedia >Priority: Minor > Fix For: 2.2.0 > > > The previously hardcoded max 4 retries per stage is not suitable for all > cluster configurations. Since spark retries a stage at the first sign of a > fetch failure, you can easily end up with many stage retries to discover all > the failures. In particular, two scenarios in which this value should change are (1) > if there are more than 4 executors per node; in that case, it may take 4 > retries to discover the problem with each executor on the node and (2) during > cluster maintenance on large clusters, where multiple machines are serviced > at once, but you also cannot afford total cluster downtime. By making this > value configurable, cluster managers can tune this value to something more > appropriate to their cluster configuration. 
[jira] [Updated] (SPARK-13369) Number of consecutive fetch failures for a stage before the job is aborted should be configurable
[ https://issues.apache.org/jira/browse/SPARK-13369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid updated SPARK-13369: - Description: The previously hardcoded max 4 retries per stage is not suitable for all cluster configurations. Since spark retries a stage at the first sign of a fetch failure, you can easily end up with many stage retries to discover all the failures. In particular, two scenarios in which this value should change are (1) if there are more than 4 executors per node; in that case, it may take 4 retries to discover the problem with each executor on the node and (2) during cluster maintenance on large clusters, where multiple machines are serviced at once, but you also cannot afford total cluster downtime. By making this value configurable, cluster managers can tune this value to something more appropriate to their cluster configuration. (was: Currently it is hardcoded in the code. We need to make it configurable because for long running jobs, the chances of fetch failures due to machine reboot are high and we need a configuration parameter to bump up that number. ) > Number of consecutive fetch failures for a stage before the job is aborted > should be configurable > -- > > Key: SPARK-13369 > URL: https://issues.apache.org/jira/browse/SPARK-13369 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 1.6.0 >Reporter: Sital Kedia >Assignee: Sital Kedia >Priority: Minor > Fix For: 2.2.0 > > > The previously hardcoded max 4 retries per stage is not suitable for all > cluster configurations. Since spark retries a stage at the first sign of a > fetch failure, you can easily end up with many stage retries to discover all > the failures. 
In particular, two scenarios this value should change are (1) > if there are more than 4 executors per node; in that case, it may take 4 > retries to discover the problem with each executor on the node and (2) during > cluster maintenance on large clusters, where multiple machines are serviced > at once, but you also cannot afford total cluster downtime. By making this > value configurable, cluster managers can tune this value to something more > appropriate to their cluster configuration. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
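The mechanism under discussion boils down to counting consecutive fetch-failure-triggered stage retries against a configurable limit. A minimal sketch of that decision logic (an illustration only, not Spark's actual DAGScheduler code; the class and method names here are invented for this example, and the default of 4 mirrors the previously hardcoded limit):

```java
// Simplified sketch of the retry-abort decision discussed in this ticket.
// Illustrative only; not Spark's actual scheduler code.
public class StageRetryPolicy {
    private final int maxConsecutiveFetchFailures;
    private int consecutiveFetchFailures = 0;

    public StageRetryPolicy(int maxConsecutiveFetchFailures) {
        this.maxConsecutiveFetchFailures = maxConsecutiveFetchFailures;
    }

    /** Record one fetch-failure-triggered stage retry; true means abort the job. */
    public boolean onFetchFailure() {
        consecutiveFetchFailures++;
        return consecutiveFetchFailures >= maxConsecutiveFetchFailures;
    }

    /** A successful stage attempt resets the consecutive-failure count. */
    public void onStageSuccess() {
        consecutiveFetchFailures = 0;
    }

    public static void main(String[] args) {
        StageRetryPolicy policy = new StageRetryPolicy(4);
        boolean abort = false;
        for (int i = 1; i <= 4 && !abort; i++) {
            // With the default limit of 4, the fourth consecutive failure aborts.
            abort = policy.onFetchFailure();
            System.out.println("fetch failure " + i + " -> abort job: " + abort);
        }
    }
}
```

The fix that shipped in 2.2.0 exposes this limit as a Spark setting (reportedly `spark.stage.maxConsecutiveAttempts`), so cluster operators can raise it for the multi-executor-per-node and maintenance scenarios described above.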
[jira] [Assigned] (SPARK-13369) Number of consecutive fetch failures for a stage before the job is aborted should be configurable
[ https://issues.apache.org/jira/browse/SPARK-13369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid reassigned SPARK-13369: Assignee: Imran Rashid > Number of consecutive fetch failures for a stage before the job is aborted > should be configurable > -- > > Key: SPARK-13369 > URL: https://issues.apache.org/jira/browse/SPARK-13369 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 1.6.0 >Reporter: Sital Kedia >Assignee: Imran Rashid >Priority: Minor > Fix For: 2.2.0 > > > Currently it is hardcode inside code. We need to make it configurable because > for long running jobs, the chances of fetch failures due to machine reboot is > high and we need a configuration parameter to bump up that number. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-13369) Number of consecutive fetch failures for a stage before the job is aborted should be configurable
[ https://issues.apache.org/jira/browse/SPARK-13369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid reassigned SPARK-13369: Assignee: Sital Kedia (was: Imran Rashid) > Number of consecutive fetch failures for a stage before the job is aborted > should be configurable > -- > > Key: SPARK-13369 > URL: https://issues.apache.org/jira/browse/SPARK-13369 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 1.6.0 >Reporter: Sital Kedia >Assignee: Sital Kedia >Priority: Minor > Fix For: 2.2.0 > > > Currently it is hardcode inside code. We need to make it configurable because > for long running jobs, the chances of fetch failures due to machine reboot is > high and we need a configuration parameter to bump up that number. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-13369) Number of consecutive fetch failures for a stage before the job is aborted should be configurable
[ https://issues.apache.org/jira/browse/SPARK-13369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid resolved SPARK-13369. -- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17307 [https://github.com/apache/spark/pull/17307] > Number of consecutive fetch failures for a stage before the job is aborted > should be configurable > -- > > Key: SPARK-13369 > URL: https://issues.apache.org/jira/browse/SPARK-13369 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 1.6.0 >Reporter: Sital Kedia >Priority: Minor > Fix For: 2.2.0 > > > Currently it is hardcode inside code. We need to make it configurable because > for long running jobs, the chances of fetch failures due to machine reboot is > high and we need a configuration parameter to bump up that number. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-19996) transfer spark-defaults.conf to spark-defaults.xml
[ https://issues.apache.org/jira/browse/SPARK-19996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-19996. --- Resolution: Invalid > transfer spark-defaults.conf to spark-defaults.xml > -- > > Key: SPARK-19996 > URL: https://issues.apache.org/jira/browse/SPARK-19996 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.1.0 >Reporter: hanzhi > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-19644) Memory leak in Spark Streaming
[ https://issues.apache.org/jira/browse/SPARK-19644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929979#comment-15929979 ]

Deenbandhu Agarwal commented on SPARK-19644:
--------------------------------------------
Any updates?

> Memory leak in Spark Streaming
> ------------------------------
>
>                 Key: SPARK-19644
>                 URL: https://issues.apache.org/jira/browse/SPARK-19644
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams
>    Affects Versions: 2.0.2
>        Environment: 3 AWS EC2 c3.xLarge
> Number of cores - 3
> Number of executors 3
> Memory to each executor 2GB
>            Reporter: Deenbandhu Agarwal
>            Priority: Critical
>              Labels: memory_leak, performance
>         Attachments: Dominator_tree.png, heapdump.png, Path2GCRoot.png
>
>
> I am using streaming in production for some aggregation, fetching data from Cassandra and saving data back to Cassandra.
> I see a gradual increase in old-generation heap capacity from 1161216 bytes to 1397760 bytes over a period of six hours.
> After 50 hours of processing, instances of the class scala.collection.immutable.$colon$colon increased to 12,811,793, which is a huge number.
> I think this is a clear case of a memory leak.
[jira] [Commented] (SPARK-20000) Spark Hive tests aborted due to lz4-java on ppc64le
[ https://issues.apache.org/jira/browse/SPARK-20000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929965#comment-15929965 ]

Sean Owen commented on SPARK-20000:
-----------------------------------
20000!

> Spark Hive tests aborted due to lz4-java on ppc64le
> ---------------------------------------------------
>
>                 Key: SPARK-20000
>                 URL: https://issues.apache.org/jira/browse/SPARK-20000
>             Project: Spark
>          Issue Type: Bug
>          Components: Tests
>    Affects Versions: 2.2.0
>        Environment: Ubuntu 14.04 ppc64le
> $ java -version
> openjdk version "1.8.0_111"
> OpenJDK Runtime Environment (build 1.8.0_111-8u111-b14-3~14.04.1-b14)
> OpenJDK 64-Bit Server VM (build 25.111-b14, mixed mode)
>            Reporter: Sonia Garudi
>              Labels: ppc64le
>         Attachments: hs_err_pid.log
>
>
> The tests are getting aborted in the Spark Hive project with the following error:
> {code:borderStyle=solid}
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x3fff94dbf114, pid=6160, tid=0x3fff6efef1a0
> #
> # JRE version: OpenJDK Runtime Environment (8.0_111-b14) (build 1.8.0_111-8u111-b14-3~14.04.1-b14)
> # Java VM: OpenJDK 64-Bit Server VM (25.111-b14 mixed mode linux-ppc64 compressed oops)
> # Problematic frame:
> # V  [libjvm.so+0x56f114]
> {code}
> In the thread log file, I found the following traces:
> Event: 3669.042 Thread 0x3fff89976800 Exception 'java/lang/NoClassDefFoundError': Could not initialize class net.jpountz.lz4.LZ4JNI (0x00079fcda3b8) thrown at [/build/openjdk-8-fVIxxI/openjdk-8-8u111-b14/src/hotspot/src/share/vm/oops/instanceKlass.cpp, line 890]
> This error is due to lz4-java (version 1.3.0), which doesn't have support for ppc64le. PFA the thread log file.
[jira] [Commented] (SPARK-19992) spark-submit on deployment-mode cluster
[ https://issues.apache.org/jira/browse/SPARK-19992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929762#comment-15929762 ]

narendra maru commented on SPARK-19992:
---------------------------------------
Yes, I have the same YARN jars and SPARK_HOME directory on all the nodes, at the same location.

> spark-submit on deployment-mode cluster
> ---------------------------------------
>
>                 Key: SPARK-19992
>                 URL: https://issues.apache.org/jira/browse/SPARK-19992
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Submit
>    Affects Versions: 2.0.2
>        Environment: spark version 2.0.2
> hadoop version 2.6.0
>            Reporter: narendra maru
>
> spark version 2.0.2
> hadoop version 2.6.0
> spark-submit command:
> "spark-submit --class spark.mongohadoop.testing3 --master yarn --deploy-mode cluster --jars /home/ec2-user/jars/hgmongonew.jar,/home/ec2-user/jars/mongo-hadoop-spark-2.0.1.jar"
> after adding the following in
> 1. spark-defaults.conf:
> spark.executor.extraJavaOptions -Dconfig.fuction.conf
> spark.yarn.jars=local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/*
> spark.eventLog.dir=hdfs://localhost:9000/user/spark/applicationHistory
> spark.eventLog.enabled=true
> 2. yarn-site.xml:
> yarn.application.classpath
> /usr/local/hadoop-2.6.0/etc/hadoop,
> /usr/local/hadoop-2.6.0/,
> /usr/local/hadoop-2.6.0/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/common/,
> /usr/local/hadoop-2.6.0/share/hadoop/common/lib/
> /usr/local/hadoop-2.6.0/share/hadoop/hdfs/,
> /usr/local/hadoop-2.6.0/share/hadoop/hdfs/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/,
> /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/tools/lib/,
> /usr/local/hadoop-2.6.0/share/hadoop/yarn/,
> /usr/local/hadoop-2.6.0/share/hadoop/yarn/lib/*,
> /usr/local/spark-2.0.2-bin-hadoop2.6/jars/spark-yarn_2.11-2.0.2.jar
>
> Error in the log:
> Error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster
> Error on the terminal:
> diagnostics: Application application_1489673977198_0002 failed 2 times due to AM Container for appattempt_1489673977198_0002_02 exited with exitCode: 1
> For more detailed output, check the application tracking page: http://bdg-hdp-sparkmaster:8088/proxy/application_1489673977198_0002/ Then, click on links to logs of each attempt.
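When the ApplicationMaster class cannot be loaded, a quick sanity check is to verify that the jars referenced by `spark.yarn.jars` actually contain it. A small helper for that check (a hypothetical diagnostic utility written for this discussion, not part of Spark; the demo builds a synthetic jar rather than assuming any jar exists on disk):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

public class JarClassCheck {
    // Return true if the given jar contains the .class entry for className.
    public static boolean jarContainsClass(String jarPath, String className) throws IOException {
        String entryName = className.replace('.', '/') + ".class";
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Demo on a synthetic jar; in practice point jarContainsClass at the
        // jars your spark.yarn.jars setting resolves to and ask for
        // org.apache.spark.deploy.yarn.ApplicationMaster.
        File demo = File.createTempFile("demo", ".jar");
        try (JarOutputStream jos = new JarOutputStream(new FileOutputStream(demo))) {
            jos.putNextEntry(new JarEntry("org/apache/spark/deploy/yarn/ApplicationMaster.class"));
            jos.closeEntry();
        }
        System.out.println(jarContainsClass(demo.getPath(),
                "org.apache.spark.deploy.yarn.ApplicationMaster")); // prints true
        demo.delete();
    }
}
```

If the check returns false for every jar the configuration points to, the `spark.yarn.jars` glob (here `local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/*`) is likely not matching the jar that actually contains the ApplicationMaster.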
[jira] [Updated] (SPARK-19996) transfer spark-defaults.conf to spark-defaults.xml
[ https://issues.apache.org/jira/browse/SPARK-19996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hanzhi updated SPARK-19996:
---------------------------
    Docs Text: Supports XML include, which is more convenient.  (was: Supports the XML include function, which is more convenient.)

> transfer spark-defaults.conf to spark-defaults.xml
> --------------------------------------------------
>
>                 Key: SPARK-19996
>                 URL: https://issues.apache.org/jira/browse/SPARK-19996
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.1.0
>            Reporter: hanzhi
>
[jira] [Updated] (SPARK-20000) Spark Hive tests aborted due to lz4-java on ppc64le
[ https://issues.apache.org/jira/browse/SPARK-20000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sonia Garudi updated SPARK-20000:
---------------------------------
    Attachment: hs_err_pid.log

> Spark Hive tests aborted due to lz4-java on ppc64le
> ---------------------------------------------------
>
>                 Key: SPARK-20000
>                 URL: https://issues.apache.org/jira/browse/SPARK-20000
>             Project: Spark
>          Issue Type: Bug
>          Components: Tests
>    Affects Versions: 2.2.0
>        Environment: Ubuntu 14.04 ppc64le
> $ java -version
> openjdk version "1.8.0_111"
> OpenJDK Runtime Environment (build 1.8.0_111-8u111-b14-3~14.04.1-b14)
> OpenJDK 64-Bit Server VM (build 25.111-b14, mixed mode)
>            Reporter: Sonia Garudi
>              Labels: ppc64le
>         Attachments: hs_err_pid.log
>
>
> The tests are getting aborted in the Spark Hive project with the following error:
> {code:borderStyle=solid}
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x3fff94dbf114, pid=6160, tid=0x3fff6efef1a0
> #
> # JRE version: OpenJDK Runtime Environment (8.0_111-b14) (build 1.8.0_111-8u111-b14-3~14.04.1-b14)
> # Java VM: OpenJDK 64-Bit Server VM (25.111-b14 mixed mode linux-ppc64 compressed oops)
> # Problematic frame:
> # V  [libjvm.so+0x56f114]
> {code}
> In the thread log file, I found the following traces:
> Event: 3669.042 Thread 0x3fff89976800 Exception 'java/lang/NoClassDefFoundError': Could not initialize class net.jpountz.lz4.LZ4JNI (0x00079fcda3b8) thrown at [/build/openjdk-8-fVIxxI/openjdk-8-8u111-b14/src/hotspot/src/share/vm/oops/instanceKlass.cpp, line 890]
> This error is due to lz4-java (version 1.3.0), which doesn't have support for ppc64le. PFA the thread log file.
[jira] [Created] (SPARK-20000) Spark Hive tests aborted due to lz4-java on ppc64le
Sonia Garudi created SPARK-20000:
------------------------------------

             Summary: Spark Hive tests aborted due to lz4-java on ppc64le
                 Key: SPARK-20000
                 URL: https://issues.apache.org/jira/browse/SPARK-20000
             Project: Spark
          Issue Type: Bug
          Components: Tests
    Affects Versions: 2.2.0
         Environment: Ubuntu 14.04 ppc64le
$ java -version
openjdk version "1.8.0_111"
OpenJDK Runtime Environment (build 1.8.0_111-8u111-b14-3~14.04.1-b14)
OpenJDK 64-Bit Server VM (build 25.111-b14, mixed mode)
            Reporter: Sonia Garudi


The tests are getting aborted in the Spark Hive project with the following error:

{code:borderStyle=solid}
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x3fff94dbf114, pid=6160, tid=0x3fff6efef1a0
#
# JRE version: OpenJDK Runtime Environment (8.0_111-b14) (build 1.8.0_111-8u111-b14-3~14.04.1-b14)
# Java VM: OpenJDK 64-Bit Server VM (25.111-b14 mixed mode linux-ppc64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x56f114]
{code}

In the thread log file, I found the following traces:

Event: 3669.042 Thread 0x3fff89976800 Exception 'java/lang/NoClassDefFoundError': Could not initialize class net.jpountz.lz4.LZ4JNI (0x00079fcda3b8) thrown at [/build/openjdk-8-fVIxxI/openjdk-8-8u111-b14/src/hotspot/src/share/vm/oops/instanceKlass.cpp, line 890]

This error is due to lz4-java (version 1.3.0), which doesn't have support for ppc64le. PFA the thread log file.
[jira] [Updated] (SPARK-19997) proxy-user failed connecting to a kerberos configured metastore
[ https://issues.apache.org/jira/browse/SPARK-19997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao updated SPARK-19997: - Description: Start runing spark-sql via proxy-user on a kerberos configured hadoop cluster and metastore {code} bin/spark-sql --proxy-user hzyaoqin {code} Failed with the following err: {code:java} 17/03/17 16:05:41 INFO hive.metastore: Trying to connect to metastore with URI thrift://xxx:9083 17/03/17 16:05:41 ERROR transport.TSaslTransport: SASL negotiation failure javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212) at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:420) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:236) at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.(SessionHiveMetaStoreClient.java:74) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:86) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024) at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1234) at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174) at org.apache.hadoop.hive.ql.metadata.Hive.(Hive.java:166) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503) at org.apache.spark.sql.hive.client.HiveClientImpl.(HiveClientImpl.scala:192) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264) at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:366) at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:270) at org.apache.spark.sql.hive.HiveExternalCatalog.(HiveExternalCatalog.scala:65) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:166) at org.apache.spark.sql.internal.SharedState.(SharedState.scala:86) at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101) at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101) at scala.Option.getOrElse(Option.scala:121) at
[jira] [Created] (SPARK-19999) Test failures in Spark Core due to java.nio.Bits.unaligned()
Sonia Garudi created SPARK-19999:
------------------------------------

             Summary: Test failures in Spark Core due to java.nio.Bits.unaligned()
                 Key: SPARK-19999
                 URL: https://issues.apache.org/jira/browse/SPARK-19999
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.1.0, 2.2.0
         Environment: Ubuntu 14.04 ppc64le
$ java -version
openjdk version "1.8.0_111"
OpenJDK Runtime Environment (build 1.8.0_111-8u111-b14-3~14.04.1-b14)
OpenJDK 64-Bit Server VM (build 25.111-b14, mixed mode)
            Reporter: Sonia Garudi


There are multiple test failures in the Spark Core project with the following error message:

{code:borderStyle=solid}
java.lang.IllegalArgumentException: requirement failed: No support for unaligned Unsafe. Set spark.memory.offHeap.enabled to false.
{code}

These errors occur because java.nio.Bits.unaligned() does not return true for the ppc64le arch.
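The failing requirement comes from probing the JVM for unaligned-access support. A rough sketch of that probing pattern (an illustration of the approach, not Spark's exact `Platform` code; the fallback architecture list is an assumption for this example): reflectively call the internal `java.nio.Bits.unaligned()` method, and fall back to an allowlist of architectures when the call is unavailable or, as on ppc64le here, returns false by default.

```java
import java.lang.reflect.Method;

public class UnalignedCheck {
    // Fallback allowlist of architectures commonly assumed to handle
    // unaligned loads; used when java.nio.Bits.unaligned() cannot be called.
    static boolean archSupportsUnaligned(String arch) {
        return arch.matches("^(i[3-6]86|x86(_64)?|x64|amd64|aarch64)$");
    }

    public static boolean unalignedSupported() {
        try {
            // Bits is package-private, so it must be reached reflectively.
            Class<?> bits = Class.forName("java.nio.Bits");
            Method unaligned = bits.getDeclaredMethod("unaligned");
            unaligned.setAccessible(true);
            return (Boolean) unaligned.invoke(null);
        } catch (Throwable t) {
            // Reflection can fail (e.g. on JDKs that restrict internal access);
            // fall back to the architecture allowlist.
            return archSupportsUnaligned(System.getProperty("os.arch", ""));
        }
    }

    public static void main(String[] args) {
        System.out.println("unaligned access supported: " + unalignedSupported());
    }
}
```

One way the report was eventually addressable is exactly through such a fallback list: treating architectures like ppc64le that tolerate unaligned access as supported even though `Bits.unaligned()` reports false for them.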
[jira] [Updated] (SPARK-19997) proxy-user failed connecting to a kerberos configured metastore
[ https://issues.apache.org/jira/browse/SPARK-19997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao updated SPARK-19997: - Description: Start runing spark-sql via proxy-user on a kerberos configured hadoop cluster and metastore {code} bin/spark-sql --proxy-user hzyaoqin {code} Failed with the following err: {code:java} 17/03/17 16:05:41 INFO hive.metastore: Trying to connect to metastore with URI thrift://hadoop712.lt.163.org:9083 17/03/17 16:05:41 ERROR transport.TSaslTransport: SASL negotiation failure javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212) at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:420) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:236) at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.(SessionHiveMetaStoreClient.java:74) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:86) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024) at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1234) at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174) at org.apache.hadoop.hive.ql.metadata.Hive.(Hive.java:166) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503) at org.apache.spark.sql.hive.client.HiveClientImpl.(HiveClientImpl.scala:192) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264) at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:366) at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:270) at org.apache.spark.sql.hive.HiveExternalCatalog.(HiveExternalCatalog.scala:65) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:166) at org.apache.spark.sql.internal.SharedState.(SharedState.scala:86) at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101) at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101) at scala.Option.getOrElse(Option.scala:121) at
[jira] [Created] (SPARK-19998) BlockRDD block not found Exception add RDD id info
jianran.tfh created SPARK-19998:
-----------------------------------

             Summary: BlockRDD block not found Exception add RDD id info
                 Key: SPARK-19998
                 URL: https://issues.apache.org/jira/browse/SPARK-19998
             Project: Spark
          Issue Type: Improvement
          Components: Block Manager
    Affects Versions: 2.1.0
            Reporter: jianran.tfh
            Priority: Trivial


The exception "java.lang.Exception: Could not compute split, block $blockId not found" doesn't include the RDD id, while the "BlockManager: Removing RDD $id" log message has only the RDD id, so it is hard to tell that the removal is the cause of the exception. It would be better if the block-not-found exception also included the RDD id.
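The proposed improvement amounts to carrying both identifiers in the error message so it can be correlated with the removal log line. A minimal sketch (the exact wording and the sample block id are illustrative, not the actual BlockRDD patch):

```java
public class BlockNotFoundMessage {
    // Build the error message with both identifiers, so the failure can be
    // matched against the "BlockManager: Removing RDD <id>" log line.
    static String message(int rddId, String blockId) {
        return "Could not compute split, block " + blockId
                + " of RDD " + rddId + " not found";
    }

    public static void main(String[] args) {
        // Hypothetical streaming block id, for illustration only.
        System.out.println(message(42, "input-0-1489673977198"));
    }
}
```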
[jira] [Created] (SPARK-19997) proxy-user failed connecting to a kerberos configured metastore
Kent Yao created SPARK-19997: Summary: proxy-user failed connecting to a kerberos configured metastore Key: SPARK-19997 URL: https://issues.apache.org/jira/browse/SPARK-19997 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 2.1.0 Reporter: Kent Yao Start runing spark-sql via proxy-user on a kerberos configured hadoop cluster and metastore {code:shell} bin/spark-sql --proxy-user hzyaoqin {code} Failed with the following err: {code:scala} 17/03/17 16:05:41 INFO hive.metastore: Trying to connect to metastore with URI thrift://hadoop712.lt.163.org:9083 17/03/17 16:05:41 ERROR transport.TSaslTransport: SASL negotiation failure javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212) at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:420) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:236) at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.(SessionHiveMetaStoreClient.java:74) at 
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:86) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024) at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1234) at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174) at org.apache.hadoop.hive.ql.metadata.Hive.(Hive.java:166) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503) at org.apache.spark.sql.hive.client.HiveClientImpl.(HiveClientImpl.scala:192) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264) at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:366) at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:270) at org.apache.spark.sql.hive.HiveExternalCatalog.(HiveExternalCatalog.scala:65) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:166) at org.apache.spark.sql.internal.SharedState.(SharedState.scala:86) at
[jira] [Commented] (SPARK-18278) Support native submission of spark jobs to a kubernetes cluster
[ https://issues.apache.org/jira/browse/SPARK-18278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929591#comment-15929591 ]

Andrew Ash commented on SPARK-18278:
------------------------------------
As an update on this ticket:

For those not already aware, work on native Spark integration with Kubernetes has been proceeding for the past several months in the {{branch-2.1-kubernetes}} branch of https://github.com/apache-spark-on-k8s/spark, based off the 2.1.0 Apache release. We have an active core of about a half dozen contributors to the project, with a wider group of about another dozen observing. Communication happens through the issues on the GitHub repo, a dedicated room in the Kubernetes Slack, and weekly video conferences hosted by the Kubernetes Big Data SIG. The full patch set is currently about 5500 lines, with about 500 of that being user/dev documentation.

Infrastructure-wise, we have a cloud-hosted CI Jenkins instance, set up and donated by project members, which runs both unit tests and Kubernetes integration tests over the code.

We recently entered a code freeze for our release branch and are preparing a first release to the wider community, which we plan to announce on the general Spark users list. It includes the completed "phase one" portion of the design doc shared a few months ago (https://docs.google.com/document/d/1_bBzOZ8rKiOSjQg78DXOA3ZBIo_KkDJjqxVuq0yXdew/edit#heading=h.fua3ml5mcolt), featuring cluster mode with static allocation of executors, submission of local resources, SSL throughout, and support for JVM languages (Java/Scala).

After that release we'll continue to stabilize and improve the phase one feature set and move into a second phase of Kubernetes work. It will likely focus on support for dynamic allocation, though we haven't finalized planning for phase two yet. Working on the pluggable scheduler in SPARK-19700 may be included as well.
Interested parties are of course welcome to watch the repo, join the weekly video conferences, give the code a shot, and contribute to the project! > Support native submission of spark jobs to a kubernetes cluster > --- > > Key: SPARK-18278 > URL: https://issues.apache.org/jira/browse/SPARK-18278 > Project: Spark > Issue Type: Umbrella > Components: Build, Deploy, Documentation, Scheduler, Spark Core >Reporter: Erik Erlandson > Attachments: SPARK-18278 - Spark on Kubernetes Design Proposal.pdf > > > A new Apache Spark sub-project that enables native support for submitting > Spark applications to a kubernetes cluster. The submitted application runs > in a driver executing on a kubernetes pod, and executors lifecycles are also > managed as pods. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-19882) Pivot with null as the pivot value throws NPE
[ https://issues.apache.org/jira/browse/SPARK-19882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-19882: --- Assignee: Andrew Ray > Pivot with null as the pivot value throws NPE > - > > Key: SPARK-19882 > URL: https://issues.apache.org/jira/browse/SPARK-19882 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.2.0 >Reporter: Hyukjin Kwon >Assignee: Andrew Ray > Fix For: 2.2.0 > > > This seems to be a regression. > - Spark 1.6 > {code} > Seq(Tuple1(None), > Tuple1(Some(1))).toDF("a").groupBy($"a").pivot("a").count().show() > {code} > prints > {code} > +++---+ > | a|null| 1| > +++---+ > |null| 0| 0| > | 1| 0| 1| > +++---+ > {code} > - Current master > {code} > Seq(Tuple1(None), > Tuple1(Some(1))).toDF("a").groupBy($"a").pivot("a").count().show() > {code} > prints > {code} > java.lang.NullPointerException was thrown. > java.lang.NullPointerException > at > org.apache.spark.sql.catalyst.expressions.aggregate.PivotFirst$$anonfun$4.apply(PivotFirst.scala:145) > at > org.apache.spark.sql.catalyst.expressions.aggregate.PivotFirst$$anonfun$4.apply(PivotFirst.scala:143) > at scala.collection.immutable.List.map(List.scala:273) > at > org.apache.spark.sql.catalyst.expressions.aggregate.PivotFirst.(PivotFirst.scala:143) > at > org.apache.spark.sql.catalyst.analysis.Analyzer$ResolvePivot$$anonfun$apply$7$$anonfun$24.apply(Analyzer.scala:509) > {code}
[jira] [Resolved] (SPARK-19882) Pivot with null as the pivot value throws NPE
[ https://issues.apache.org/jira/browse/SPARK-19882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-19882. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17226 [https://github.com/apache/spark/pull/17226] > Pivot with null as the pivot value throws NPE > - > > Key: SPARK-19882 > URL: https://issues.apache.org/jira/browse/SPARK-19882 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.2.0 >Reporter: Hyukjin Kwon > Fix For: 2.2.0 > > > This seems to be a regression. > - Spark 1.6 > {code} > Seq(Tuple1(None), > Tuple1(Some(1))).toDF("a").groupBy($"a").pivot("a").count().show() > {code} > prints > {code} > +++---+ > | a|null| 1| > +++---+ > |null| 0| 0| > | 1| 0| 1| > +++---+ > {code} > - Current master > {code} > Seq(Tuple1(None), > Tuple1(Some(1))).toDF("a").groupBy($"a").pivot("a").count().show() > {code} > prints > {code} > java.lang.NullPointerException was thrown. > java.lang.NullPointerException > at > org.apache.spark.sql.catalyst.expressions.aggregate.PivotFirst$$anonfun$4.apply(PivotFirst.scala:145) > at > org.apache.spark.sql.catalyst.expressions.aggregate.PivotFirst$$anonfun$4.apply(PivotFirst.scala:143) > at scala.collection.immutable.List.map(List.scala:273) > at > org.apache.spark.sql.catalyst.expressions.aggregate.PivotFirst.(PivotFirst.scala:143) > at > org.apache.spark.sql.catalyst.analysis.Analyzer$ResolvePivot$$anonfun$apply$7$$anonfun$24.apply(Analyzer.scala:509) > {code}
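The stack trace above points into PivotFirst, where each distinct pivot value is resolved to an output column position. The essence of that lookup can be sketched in plain Python (a hypothetical illustration, not Spark's actual code): the value-to-column mapping must tolerate null/None as a key, which is what the fix restores.

```python
def pivot_column_index(pivot_values):
    """Map each distinct pivot value, including None, to an output
    column position. Keying a dict directly on the values accepts
    None, avoiding the null-lookup failure seen in the trace above.
    Hypothetical sketch, not Spark's PivotFirst implementation."""
    return {value: i for i, value in enumerate(pivot_values)}

# Null as a pivot value should resolve to a column, not throw
columns = pivot_column_index([None, 1])
```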
[jira] [Updated] (SPARK-19927) SparkThriftServer2 can not get ''--hivevar" variables in spark 2.1
[ https://issues.apache.org/jira/browse/SPARK-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bruce xu updated SPARK-19927: - Description: suppose the content of file test1.sql: - USE ${hivevar:db_name}; - when execute command: bin/spark-sql -f /tmp/test.sql --hivevar db_name=offline the output is: Error: org.apache.spark.sql.catalyst.parser.ParseException: no viable alternative at input ''(line 1, pos 4) == SQL == use ^^^ (state=,code=0) - so the parameter --hivevar can not be read from CLI. the bug still appears with beeline command: bin/beeline -f /tmp/test2.sql --hivevar db_name=offline with test2.sql: !connect jdbc:hive2://localhost:1 test test USE ${hivevar:db_name}; -- was: suppose the content of test1.sql: - USE ${hivevar:db_name}; - when execute: bin/spark-sql -f /tmp/test.sql --hivevar db_name=offline the output is: Error: org.apache.spark.sql.catalyst.parser.ParseException: no viable alternative at input ''(line 1, pos 4) == SQL == use ^^^ (state=,code=0) - so hivevar can not be read from CLI. 
The bug still appears with the beeline command: bin/beeline -f /tmp/test2.sql --hivevar db_name=offline with test2.sql: !connect jdbc:hive2://localhost:1 test test USE ${hivevar:db_name}; > SparkThriftServer2 can not get ''--hivevar" variables in spark 2.1 > -- > > Key: SPARK-19927 > URL: https://issues.apache.org/jira/browse/SPARK-19927 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.0.1, 2.1.0 > Environment: CentOS 6.5,spark 2.1 build with mvn -Pyarn -Phadoop-2.6 > -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver -Dscala-2.11 >Reporter: bruce xu > > suppose the content of file test1.sql: > - > USE ${hivevar:db_name}; > - > > when executing the command: bin/spark-sql -f /tmp/test.sql --hivevar > db_name=offline > the output is: > > Error: org.apache.spark.sql.catalyst.parser.ParseException: > no viable alternative at input ''(line 1, pos 4) > == SQL == > use > ^^^ (state=,code=0) > - > so the parameter --hivevar cannot be read from the CLI. > the bug still appears with the beeline command: bin/beeline -f /tmp/test2.sql > --hivevar db_name=offline with test2.sql: > > !connect jdbc:hive2://localhost:1 test test > USE ${hivevar:db_name}; > --
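Until the CLI honors --hivevar again, one client-side workaround is to substitute the variables into the script before handing it to spark-sql or beeline. A minimal sketch of that substitution (a hypothetical helper, not part of Spark):

```python
import re

def substitute_hivevars(sql, hivevars):
    """Replace ${hivevar:name} placeholders in a SQL script before it
    reaches spark-sql -- performing client-side the substitution the
    CLI fails to do. `hivevars` maps variable names to values."""
    def replace(match):
        name = match.group(1)
        if name not in hivevars:
            raise KeyError("undefined hivevar: " + name)
        return hivevars[name]
    return re.sub(r"\$\{hivevar:(\w+)\}", replace, sql)

# Mirrors the quoted repro: --hivevar db_name=offline
rewritten = substitute_hivevars("USE ${hivevar:db_name};", {"db_name": "offline"})
```

The rewritten script ("USE offline;") can then be written to a temp file and passed with -f as usual.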
[jira] [Commented] (SPARK-19996) transfer spark-defaults.conf to spark-defaults.xml
[ https://issues.apache.org/jira/browse/SPARK-19996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929580#comment-15929580 ] Sean Owen commented on SPARK-19996: --- What is this about? The conf file isn't going to be switched to XML. > transfer spark-defaults.conf to spark-defaults.xml > -- > > Key: SPARK-19996 > URL: https://issues.apache.org/jira/browse/SPARK-19996 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.1.0 >Reporter: hanzhi >
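One reason the conf file stays plain text: spark-defaults.conf is a simple properties format, one key/value pair per line, which needs no XML machinery. A toy parser sketch of that format (assumptions: whitespace or '=' as the separator, '#' comments; this is an illustration, not Spark's actual loader):

```python
import re

def parse_spark_defaults(text):
    """Parse spark-defaults.conf-style content: one 'key value' or
    'key=value' pair per line, blank lines and '#' comments skipped.
    A sketch of the plain-properties format, not Spark's loader."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = re.match(r"([^\s=]+)[=\s]+(.*)", line)
        if match:
            conf[match.group(1)] = match.group(2).strip()
    return conf

sample = """
# Default system properties included when running spark-submit.
spark.eventLog.enabled=true
spark.yarn.jars local:/usr/local/spark/yarn/*
"""
conf = parse_spark_defaults(sample)
```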
[jira] [Updated] (SPARK-19927) SparkThriftServer2 can not get ''--hivevar" variables in spark 2.1
[ https://issues.apache.org/jira/browse/SPARK-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bruce xu updated SPARK-19927: - Affects Version/s: 2.0.1 > SparkThriftServer2 can not get ''--hivevar" variables in spark 2.1 > -- > > Key: SPARK-19927 > URL: https://issues.apache.org/jira/browse/SPARK-19927 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.0.1, 2.1.0 > Environment: CentOS 6.5,spark 2.1 build with mvn -Pyarn -Phadoop-2.6 > -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver -Dscala-2.11 >Reporter: bruce xu > > suppose the content of test1.sql: > - > USE ${hivevar:db_name}; > - > > when execute: bin/spark-sql -f /tmp/test.sql --hivevar db_name=offline > the output is: > > Error: org.apache.spark.sql.catalyst.parser.ParseException: > no viable alternative at input ''(line 1, pos 4) > == SQL == > use > ^^^ (state=,code=0) > - > so hivevar can not be read from CLI. > the bug still appears with beeline command: bin/beeline -f /tmp/test2.sql > --hivevar db_name=offline with test2.sql: > > !connect jdbc:hive2://localhost:1 test test > USE ${hivevar:db_name}; >
[jira] [Created] (SPARK-19996) transfer spark-defaults.conf to spark-defaults.xml
hanzhi created SPARK-19996: -- Summary: transfer spark-defaults.conf to spark-defaults.xml Key: SPARK-19996 URL: https://issues.apache.org/jira/browse/SPARK-19996 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 2.1.0 Reporter: hanzhi
[jira] [Created] (SPARK-19995) Using real user to connect HiveMetastore in HiveClientImpl
Saisai Shao created SPARK-19995: --- Summary: Using real user to connect HiveMetastore in HiveClientImpl Key: SPARK-19995 URL: https://issues.apache.org/jira/browse/SPARK-19995 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 2.2.0 Reporter: Saisai Shao If a user specifies "--proxy-user" in a kerberized environment with the Hive catalog implementation, HiveClientImpl will try to connect to the Hive metastore as the current user. Since we use the real user to do kinit, this makes the connection fail. We should change this, as we did before in the YARN code, to use the real user. {noformat} ERROR TSaslTransport: SASL negotiation failure javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:420) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:236) at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.(SessionHiveMetaStoreClient.java:74) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:86) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024) at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1234) at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174) at org.apache.hadoop.hive.ql.metadata.Hive.(Hive.java:166) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503) at org.apache.spark.sql.hive.client.HiveClientImpl.(HiveClientImpl.scala:188) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264) at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:366) at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:270) at org.apache.spark.sql.hive.HiveExternalCatalog.(HiveExternalCatalog.scala:65) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:173) at org.apache.spark.sql.internal.SharedState.(SharedState.scala:86) at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
[jira] [Comment Edited] (SPARK-19992) spark-submit on deployment-mode cluster
[ https://issues.apache.org/jira/browse/SPARK-19992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929547#comment-15929547 ] Saisai Shao edited comment on SPARK-19992 at 3/17/17 7:48 AM: -- Looks like I guessed wrong from the URL you provided. If you're trying to use "spark.yarn.jars=local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/*", make sure the Spark-related jars exist under the same path on every node. was (Author: jerryshao): Looks like I guess wrong from the url you provided. If you're trying to use "spark.yarn.jars=local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/*", make sure these jars existed in every node. > spark-submit on deployment-mode cluster > --- > > Key: SPARK-19992 > URL: https://issues.apache.org/jira/browse/SPARK-19992 > Project: Spark > Issue Type: Bug > Components: Spark Submit >Affects Versions: 2.0.2 > Environment: spark version 2.0.2 > hadoop version 2.6.0 >Reporter: narendra maru > > spark version 2.0.2 > hadoop version 2.6.0 > spark -submit command > "spark-submit --class spark.mongohadoop.testing3 --master yarn --deploy-mode > cluster --jars /home/ec2-user/jars/hgmongonew.jar, > /home/ec2-user/jars/mongo-hadoop-spark-2.0.1.jar" > after adding following in > 1 Spark-default.conf > spark.executor.extraJavaOptions -Dconfig.fuction.conf > spark.yarn.jars=local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/* > spark.eventLog.dir=hdfs://localhost:9000/user/spark/applicationHistory > spark.eventLog.enabled=true > 2yarn-site.xml > > yarn.application.classpath > > /usr/local/hadoop-2.6.0/etc/hadoop, > /usr/local/hadoop-2.6.0/, > /usr/local/hadoop-2.6.0/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/common/, > /usr/local/hadoop-2.6.0/share/hadoop/common/lib/ > /usr/local/hadoop-2.6.0/share/hadoop/hdfs/, > /usr/local/hadoop-2.6.0/share/hadoop/hdfs/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/, > /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/tools/lib/, > 
/usr/local/hadoop-2.6.0/share/hadoop/yarn/, > /usr/local/hadoop-2.6.0/share/hadoop/yarn/lib/*, > /usr/local/spark-2.0.2-bin-hadoop2.6/jars/spark-yarn_2.11-2.0.2.jar > > > Error on log:- > Error: Could not find or load main class > org.apache.spark.deploy.yarn.ApplicationMaster > Error on terminal:- > diagnostics: Application application_1489673977198_0002 failed 2 times due to > AM Container for appattempt_1489673977198_0002_02 exited with exitCode: 1 > For more detailed output, check application tracking > page:http://bdg-hdp-sparkmaster:8088/proxy/application_1489673977198_0002/Then, > click on links to logs of each attempt.
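Saisai's point above can be checked mechanically: with the local: scheme, YARN does not ship the jars, so the path must already resolve to files on every node, and an empty match on any node would explain the missing ApplicationMaster class. A hypothetical preflight helper (not part of Spark) that one could run on each host:

```python
import glob

def check_local_jars(spark_yarn_jars):
    """Resolve a spark.yarn.jars value using the local: scheme on the
    current host and return the jars the glob pattern matches. With
    local:, YARN does not distribute the jars -- every node must
    already hold them at the same path -- so an empty result here on
    any node would explain 'Could not find or load main class
    org.apache.spark.deploy.yarn.ApplicationMaster'. Hypothetical
    preflight helper, not part of Spark."""
    prefix = "local:"
    if not spark_yarn_jars.startswith(prefix):
        raise ValueError("expected a local: URI, got " + spark_yarn_jars)
    return sorted(glob.glob(spark_yarn_jars[len(prefix):]))
```

Running it with the value from this report, e.g. check_local_jars("local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/*"), on each node shows immediately which hosts are missing the jars.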
[jira] [Commented] (SPARK-19992) spark-submit on deployment-mode cluster
[ https://issues.apache.org/jira/browse/SPARK-19992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929547#comment-15929547 ] Saisai Shao commented on SPARK-19992: - Looks like I guessed wrong from the URL you provided. If you're trying to use "spark.yarn.jars=local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/*", make sure these jars exist on every node. > spark-submit on deployment-mode cluster > --- > > Key: SPARK-19992 > URL: https://issues.apache.org/jira/browse/SPARK-19992 > Project: Spark > Issue Type: Bug > Components: Spark Submit >Affects Versions: 2.0.2 > Environment: spark version 2.0.2 > hadoop version 2.6.0 >Reporter: narendra maru > > spark version 2.0.2 > hadoop version 2.6.0 > spark -submit command > "spark-submit --class spark.mongohadoop.testing3 --master yarn --deploy-mode > cluster --jars /home/ec2-user/jars/hgmongonew.jar, > /home/ec2-user/jars/mongo-hadoop-spark-2.0.1.jar" > after adding following in > 1 Spark-default.conf > spark.executor.extraJavaOptions -Dconfig.fuction.conf > spark.yarn.jars=local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/* > spark.eventLog.dir=hdfs://localhost:9000/user/spark/applicationHistory > spark.eventLog.enabled=true > 2yarn-site.xml > > yarn.application.classpath > > /usr/local/hadoop-2.6.0/etc/hadoop, > /usr/local/hadoop-2.6.0/, > /usr/local/hadoop-2.6.0/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/common/, > /usr/local/hadoop-2.6.0/share/hadoop/common/lib/ > /usr/local/hadoop-2.6.0/share/hadoop/hdfs/, > /usr/local/hadoop-2.6.0/share/hadoop/hdfs/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/, > /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/tools/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/yarn/, > /usr/local/hadoop-2.6.0/share/hadoop/yarn/lib/*, > /usr/local/spark-2.0.2-bin-hadoop2.6/jars/spark-yarn_2.11-2.0.2.jar > > > Error on log:- > Error: Could not find or load main class > org.apache.spark.deploy.yarn.ApplicationMaster > Error on 
terminal:- > diagnostics: Application application_1489673977198_0002 failed 2 times due to > AM Container for appattempt_1489673977198_0002_02 exited with exitCode: 1 > For more detailed output, check application tracking > page:http://bdg-hdp-sparkmaster:8088/proxy/application_1489673977198_0002/Then, > click on links to logs of each attempt.
[jira] [Commented] (SPARK-19992) spark-submit on deployment-mode cluster
[ https://issues.apache.org/jira/browse/SPARK-19992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929546#comment-15929546 ] narendra maru commented on SPARK-19992: --- Hi Saisai, I am working on plain Hadoop 2.6.0, so is there a need for any other configuration? > spark-submit on deployment-mode cluster > --- > > Key: SPARK-19992 > URL: https://issues.apache.org/jira/browse/SPARK-19992 > Project: Spark > Issue Type: Bug > Components: Spark Submit >Affects Versions: 2.0.2 > Environment: spark version 2.0.2 > hadoop version 2.6.0 >Reporter: narendra maru > > spark version 2.0.2 > hadoop version 2.6.0 > spark -submit command > "spark-submit --class spark.mongohadoop.testing3 --master yarn --deploy-mode > cluster --jars /home/ec2-user/jars/hgmongonew.jar, > /home/ec2-user/jars/mongo-hadoop-spark-2.0.1.jar" > after adding following in > 1 Spark-default.conf > spark.executor.extraJavaOptions -Dconfig.fuction.conf > spark.yarn.jars=local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/* > spark.eventLog.dir=hdfs://localhost:9000/user/spark/applicationHistory > spark.eventLog.enabled=true > 2yarn-site.xml > > yarn.application.classpath > > /usr/local/hadoop-2.6.0/etc/hadoop, > /usr/local/hadoop-2.6.0/, > /usr/local/hadoop-2.6.0/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/common/, > /usr/local/hadoop-2.6.0/share/hadoop/common/lib/ > /usr/local/hadoop-2.6.0/share/hadoop/hdfs/, > /usr/local/hadoop-2.6.0/share/hadoop/hdfs/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/, > /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/tools/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/yarn/, > /usr/local/hadoop-2.6.0/share/hadoop/yarn/lib/*, > /usr/local/spark-2.0.2-bin-hadoop2.6/jars/spark-yarn_2.11-2.0.2.jar > > > Error on log:- > Error: Could not find or load main class > org.apache.spark.deploy.yarn.ApplicationMaster > Error on terminal:- > diagnostics: Application application_1489673977198_0002 failed 2 times due to > AM 
Container for appattempt_1489673977198_0002_02 exited with exitCode: 1 > For more detailed output, check application tracking > page:http://bdg-hdp-sparkmaster:8088/proxy/application_1489673977198_0002/Then, > click on links to logs of each attempt.
[jira] [Commented] (SPARK-19992) spark-submit on deployment-mode cluster
[ https://issues.apache.org/jira/browse/SPARK-19992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929545#comment-15929545 ] narendra maru commented on SPARK-19992: --- Thanks Sean for the quick reply. I have been struggling with this same error for the last 4-5 days and have not found any way to resolve it. Can you please help me with any suggestions for Spark deployment on a YARN multinode cluster, as the same command works in deploy mode client? > spark-submit on deployment-mode cluster > --- > > Key: SPARK-19992 > URL: https://issues.apache.org/jira/browse/SPARK-19992 > Project: Spark > Issue Type: Bug > Components: Spark Submit >Affects Versions: 2.0.2 > Environment: spark version 2.0.2 > hadoop version 2.6.0 >Reporter: narendra maru > > spark version 2.0.2 > hadoop version 2.6.0 > spark -submit command > "spark-submit --class spark.mongohadoop.testing3 --master yarn --deploy-mode > cluster --jars /home/ec2-user/jars/hgmongonew.jar, > /home/ec2-user/jars/mongo-hadoop-spark-2.0.1.jar" > after adding following in > 1 Spark-default.conf > spark.executor.extraJavaOptions -Dconfig.fuction.conf > spark.yarn.jars=local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/* > spark.eventLog.dir=hdfs://localhost:9000/user/spark/applicationHistory > spark.eventLog.enabled=true > 2yarn-site.xml > > yarn.application.classpath > > /usr/local/hadoop-2.6.0/etc/hadoop, > /usr/local/hadoop-2.6.0/, > /usr/local/hadoop-2.6.0/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/common/, > /usr/local/hadoop-2.6.0/share/hadoop/common/lib/ > /usr/local/hadoop-2.6.0/share/hadoop/hdfs/, > /usr/local/hadoop-2.6.0/share/hadoop/hdfs/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/, > /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/tools/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/yarn/, > /usr/local/hadoop-2.6.0/share/hadoop/yarn/lib/*, > /usr/local/spark-2.0.2-bin-hadoop2.6/jars/spark-yarn_2.11-2.0.2.jar > > > Error on log:- > Error: Could 
not find or load main class > org.apache.spark.deploy.yarn.ApplicationMaster > Error on terminal:- > diagnostics: Application application_1489673977198_0002 failed 2 times due to > AM Container for appattempt_1489673977198_0002_02 exited with exitCode: 1 > For more detailed output, check application tracking > page:http://bdg-hdp-sparkmaster:8088/proxy/application_1489673977198_0002/Then, > click on links to logs of each attempt.
[jira] [Commented] (SPARK-19992) spark-submit on deployment-mode cluster
[ https://issues.apache.org/jira/browse/SPARK-19992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929544#comment-15929544 ] Saisai Shao commented on SPARK-19992: - Are you using an HDP environment? If so, I guess you need to configure hdp.version in Spark; you could google it. > spark-submit on deployment-mode cluster > --- > > Key: SPARK-19992 > URL: https://issues.apache.org/jira/browse/SPARK-19992 > Project: Spark > Issue Type: Bug > Components: Spark Submit >Affects Versions: 2.0.2 > Environment: spark version 2.0.2 > hadoop version 2.6.0 >Reporter: narendra maru > > spark version 2.0.2 > hadoop version 2.6.0 > spark -submit command > "spark-submit --class spark.mongohadoop.testing3 --master yarn --deploy-mode > cluster --jars /home/ec2-user/jars/hgmongonew.jar, > /home/ec2-user/jars/mongo-hadoop-spark-2.0.1.jar" > after adding following in > 1 Spark-default.conf > spark.executor.extraJavaOptions -Dconfig.fuction.conf > spark.yarn.jars=local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/* > spark.eventLog.dir=hdfs://localhost:9000/user/spark/applicationHistory > spark.eventLog.enabled=true > 2yarn-site.xml > > yarn.application.classpath > > /usr/local/hadoop-2.6.0/etc/hadoop, > /usr/local/hadoop-2.6.0/, > /usr/local/hadoop-2.6.0/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/common/, > /usr/local/hadoop-2.6.0/share/hadoop/common/lib/ > /usr/local/hadoop-2.6.0/share/hadoop/hdfs/, > /usr/local/hadoop-2.6.0/share/hadoop/hdfs/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/, > /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/tools/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/yarn/, > /usr/local/hadoop-2.6.0/share/hadoop/yarn/lib/*, > /usr/local/spark-2.0.2-bin-hadoop2.6/jars/spark-yarn_2.11-2.0.2.jar > > > Error on log:- > Error: Could not find or load main class > org.apache.spark.deploy.yarn.ApplicationMaster > Error on terminal:- > diagnostics: Application application_1489673977198_0002 failed 2 
times due to > AM Container for appattempt_1489673977198_0002_02 exited with exitCode: 1 > For more detailed output, check application tracking > page:http://bdg-hdp-sparkmaster:8088/proxy/application_1489673977198_0002/Then, > click on links to logs of each attempt.
[jira] [Commented] (SPARK-19992) spark-submit on deployment-mode cluster
[ https://issues.apache.org/jira/browse/SPARK-19992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929529#comment-15929529 ] Sean Owen commented on SPARK-19992: --- You have some build or environment problem. In particular I don't know that this is valid: spark.yarn.jars=local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/* > spark-submit on deployment-mode cluster > --- > > Key: SPARK-19992 > URL: https://issues.apache.org/jira/browse/SPARK-19992 > Project: Spark > Issue Type: Bug > Components: Spark Submit >Affects Versions: 2.0.2 > Environment: spark version 2.0.2 > hadoop version 2.6.0 >Reporter: narendra maru > > spark version 2.0.2 > hadoop version 2.6.0 > spark -submit command > "spark-submit --class spark.mongohadoop.testing3 --master yarn --deploy-mode > cluster --jars /home/ec2-user/jars/hgmongonew.jar, > /home/ec2-user/jars/mongo-hadoop-spark-2.0.1.jar" > after adding following in > 1 Spark-default.conf > spark.executor.extraJavaOptions -Dconfig.fuction.conf > spark.yarn.jars=local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/* > spark.eventLog.dir=hdfs://localhost:9000/user/spark/applicationHistory > spark.eventLog.enabled=true > 2yarn-site.xml > > yarn.application.classpath > > /usr/local/hadoop-2.6.0/etc/hadoop, > /usr/local/hadoop-2.6.0/, > /usr/local/hadoop-2.6.0/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/common/, > /usr/local/hadoop-2.6.0/share/hadoop/common/lib/ > /usr/local/hadoop-2.6.0/share/hadoop/hdfs/, > /usr/local/hadoop-2.6.0/share/hadoop/hdfs/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/, > /usr/local/hadoop-2.6.0/share/hadoop/mapreduce/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/tools/lib/, > /usr/local/hadoop-2.6.0/share/hadoop/yarn/, > /usr/local/hadoop-2.6.0/share/hadoop/yarn/lib/*, > /usr/local/spark-2.0.2-bin-hadoop2.6/jars/spark-yarn_2.11-2.0.2.jar > > > Error on log:- > Error: Could not find or load main class > org.apache.spark.deploy.yarn.ApplicationMaster > Error on terminal:- > diagnostics: 
Application application_1489673977198_0002 failed 2 times due to > AM Container for appattempt_1489673977198_0002_02 exited with exitCode: 1 > For more detailed output, check application tracking > page:http://bdg-hdp-sparkmaster:8088/proxy/application_1489673977198_0002/Then, > click on links to logs of each attempt.
[jira] [Created] (SPARK-19994) Wrong outputOrdering for right/full outer smj
Zhenhua Wang created SPARK-19994: Summary: Wrong outputOrdering for right/full outer smj Key: SPARK-19994 URL: https://issues.apache.org/jira/browse/SPARK-19994 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 2.2.0 Reporter: Zhenhua Wang For right outer join, values of the left key will be filled with nulls if they can't match the value of the right key, so `nullOrdering` of the left key can't be guaranteed. We should output the right key order. For full outer join, neither the left key nor the right key guarantees `nullOrdering`. We should not output any ordering.
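The ordering argument above can be seen in a toy merge (plain Python, not Spark's SortMergeJoinExec): after a right outer join, the right-key column stays sorted, while the left-key column is interleaved with nulls, so no nulls-first or nulls-last ordering on the left key can hold.

```python
def right_outer_join(left_keys, right_keys):
    """Toy single-column right outer join: emit every right key in
    sorted order, paired with a matching left key or None. A model of
    the outputOrdering argument, not Spark's sort-merge join."""
    matches = set(left_keys)
    return [(k if k in matches else None, k) for k in sorted(right_keys)]

rows = right_outer_join(left_keys=[2, 4], right_keys=[1, 2, 3, 4])
# rows: [(None, 1), (2, 2), (None, 3), (4, 4)]
# right-key column is sorted; left-key column mixes None between
# values, so neither nulls-first nor nulls-last ordering holds for it
```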
[jira] [Comment Edited] (SPARK-19984) ERROR codegen.CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java'
[ https://issues.apache.org/jira/browse/SPARK-19984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929496#comment-15929496 ] Takeshi Yamamuro edited comment on SPARK-19984 at 3/17/17 6:41 AM: --- If we could find a reason for this issue, it'd be better to update the title more concretely. Currently, I feel it is too ambiguous. was (Author: maropu): If we could find reason about this issue, it'd be better to update the title more concretely. Currently, I feel it is too ambiguous. > ERROR codegen.CodeGenerator: failed to compile: > org.codehaus.commons.compiler.CompileException: File 'generated.java' > - > > Key: SPARK-19984 > URL: https://issues.apache.org/jira/browse/SPARK-19984 > Project: Spark > Issue Type: Bug > Components: Optimizer >Affects Versions: 2.1.0 >Reporter: Andrey Yakovenko > > I had this error a few times on my local hadoop 2.7.3+Spark2.1.0 environment. > This is not a permanent error; the next time I run it, it could disappear. > Unfortunately I don't know how to reproduce the issue. As you can see from > the log, my logic is pretty complicated. 
> Here is a part of the log I've got (container_1489514660953_0015_01_01)
> {code}
> 17/03/16 11:07:04 ERROR codegen.CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 151, Column 29: A method named "compare" is not declared in any enclosing class nor any supertype, nor through a static import
> /* 001 */ public Object generate(Object[] references) {
> /* 002 */   return new GeneratedIterator(references);
> /* 003 */ }
> /* 004 */
> /* 005 */ final class GeneratedIterator extends org.apache.spark.sql.execution.BufferedRowIterator {
> /* 006 */   private Object[] references;
> /* 007 */   private scala.collection.Iterator[] inputs;
> /* 008 */   private boolean agg_initAgg;
> /* 009 */   private boolean agg_bufIsNull;
> /* 010 */   private long agg_bufValue;
> /* 011 */   private boolean agg_initAgg1;
> /* 012 */   private boolean agg_bufIsNull1;
> /* 013 */   private long agg_bufValue1;
> /* 014 */   private scala.collection.Iterator smj_leftInput;
> /* 015 */   private scala.collection.Iterator smj_rightInput;
> /* 016 */   private InternalRow smj_leftRow;
> /* 017 */   private InternalRow smj_rightRow;
> /* 018 */   private UTF8String smj_value2;
> /* 019 */   private java.util.ArrayList smj_matches;
> /* 020 */   private UTF8String smj_value3;
> /* 021 */   private UTF8String smj_value4;
> /* 022 */   private org.apache.spark.sql.execution.metric.SQLMetric smj_numOutputRows;
> /* 023 */   private UnsafeRow smj_result;
> /* 024 */   private org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder smj_holder;
> /* 025 */   private org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter smj_rowWriter;
> /* 026 */   private org.apache.spark.sql.execution.metric.SQLMetric agg_numOutputRows;
> /* 027 */   private org.apache.spark.sql.execution.metric.SQLMetric agg_aggTime;
> /* 028 */   private UnsafeRow agg_result;
> /* 029 */   private org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder agg_holder;
> /* 030 */   private org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter agg_rowWriter;
> /* 031 */   private org.apache.spark.sql.execution.metric.SQLMetric agg_numOutputRows1;
> /* 032 */   private org.apache.spark.sql.execution.metric.SQLMetric agg_aggTime1;
> /* 033 */   private UnsafeRow agg_result1;
> /* 034 */   private org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder agg_holder1;
> /* 035 */   private org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter agg_rowWriter1;
> /* 036 */
> /* 037 */   public GeneratedIterator(Object[] references) {
> /* 038 */     this.references = references;
> /* 039 */   }
> /* 040 */
> /* 041 */   public void init(int index, scala.collection.Iterator[] inputs) {
> /* 042 */     partitionIndex = index;
> /* 043 */     this.inputs = inputs;
> /* 044 */     wholestagecodegen_init_0();
> /* 045 */     wholestagecodegen_init_1();
> /* 046 */
> /* 047 */   }
> /* 048 */
> /* 049 */   private void wholestagecodegen_init_0() {
> /* 050 */     agg_initAgg = false;
> /* 051 */
> /* 052 */     agg_initAgg1 = false;
> /* 053 */
> /* 054 */     smj_leftInput = inputs[0];
> /* 055 */     smj_rightInput = inputs[1];
> /* 056 */
> /* 057 */     smj_rightRow = null;
> /* 058 */
> /* 059 */     smj_matches = new java.util.ArrayList();
> /* 060 */
> /* 061 */     this.smj_numOutputRows = (org.apache.spark.sql.execution.metric.SQLMetric) references[0];
> /* 062 */     smj_result = new UnsafeRow(2);
> /* 063 */     this.smj_holder = new org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder(smj_result, 64);
> /* 064 */     this.smj_rowWriter = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter(smj_holder, 2);
> /* 065 */
[jira] [Created] (SPARK-19993) Caching logical plans containing subquery expressions does not work.
Dilip Biswal created SPARK-19993:
Summary: Caching logical plans containing subquery expressions does not work.
Key: SPARK-19993
URL: https://issues.apache.org/jira/browse/SPARK-19993
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 2.1.0
Reporter: Dilip Biswal

Here is a simple repro that depicts the problem. In this case, the second invocation of the SQL should have been served from the cache; however, the lookup currently fails.
{code}
scala> val ds = spark.sql("select * from s1 where s1.c1 in (select s2.c1 from s2 where s1.c1 = s2.c1)")
ds: org.apache.spark.sql.DataFrame = [c1: int]

scala> ds.cache
res13: ds.type = [c1: int]

scala> spark.sql("select * from s1 where s1.c1 in (select s2.c1 from s2 where s1.c1 = s2.c1)").explain(true)
== Analyzed Logical Plan ==
c1: int
Project [c1#86]
+- Filter c1#86 IN (list#78 [c1#86])
   :  +- Project [c1#87]
   :     +- Filter (outer(c1#86) = c1#87)
   :        +- SubqueryAlias s2
   :           +- Relation[c1#87] parquet
   +- SubqueryAlias s1
      +- Relation[c1#86] parquet

== Optimized Logical Plan ==
Join LeftSemi, ((c1#86 = c1#87) && (c1#86 = c1#87))
:- Relation[c1#86] parquet
+- Relation[c1#87] parquet
{code}

-- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
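A quick way to confirm whether the second invocation actually hits the cache is to look for an InMemoryTableScan node in the physical plan. This is a hedged sketch, not part of the original report: it assumes the same spark-shell session and the same `s1`/`s2` parquet tables as the repro above.

```scala
// Sketch (spark-shell, Spark 2.1): if the cache lookup worked, the physical
// plan of the second query would contain an InMemoryTableScan node; with this
// bug, the subquery expression defeats plan matching and it does not.
val q = "select * from s1 where s1.c1 in (select s2.c1 from s2 where s1.c1 = s2.c1)"
spark.sql(q).cache()  // registers the plan (including the subquery expression) in the cache

val plan = spark.sql(q).queryExecution.executedPlan
val servedFromCache = plan.toString.contains("InMemoryTableScan")
println(s"served from cache: $servedFromCache")  // should be true once this issue is fixed
```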
[jira] [Created] (SPARK-19992) spark-submit on deployment-mode cluster
narendra maru created SPARK-19992:
Summary: spark-submit on deployment-mode cluster
Key: SPARK-19992
URL: https://issues.apache.org/jira/browse/SPARK-19992
Project: Spark
Issue Type: Bug
Components: Spark Submit
Affects Versions: 2.0.2
Environment: spark version 2.0.2, hadoop version 2.6.0
Reporter: narendra maru

spark version 2.0.2
hadoop version 2.6.0

spark-submit command:
"spark-submit --class spark.mongohadoop.testing3 --master yarn --deploy-mode cluster --jars /home/ec2-user/jars/hgmongonew.jar, /home/ec2-user/jars/mongo-hadoop-spark-2.0.1.jar"

after adding the following:

1. in spark-defaults.conf:
spark.executor.extraJavaOptions -Dconfig.fuction.conf
spark.yarn.jars=local:/usr/local/spark-2.0.2-bin-hadoop2.6/yarn/*
spark.eventLog.dir=hdfs://localhost:9000/user/spark/applicationHistory
spark.eventLog.enabled=true

2. in yarn-site.xml, yarn.application.classpath:
/usr/local/hadoop-2.6.0/etc/hadoop,
/usr/local/hadoop-2.6.0/,
/usr/local/hadoop-2.6.0/lib/,
/usr/local/hadoop-2.6.0/share/hadoop/common/,
/usr/local/hadoop-2.6.0/share/hadoop/common/lib/,
/usr/local/hadoop-2.6.0/share/hadoop/hdfs/,
/usr/local/hadoop-2.6.0/share/hadoop/hdfs/lib/,
/usr/local/hadoop-2.6.0/share/hadoop/mapreduce/,
/usr/local/hadoop-2.6.0/share/hadoop/mapreduce/lib/,
/usr/local/hadoop-2.6.0/share/hadoop/tools/lib/,
/usr/local/hadoop-2.6.0/share/hadoop/yarn/,
/usr/local/hadoop-2.6.0/share/hadoop/yarn/lib/*,
/usr/local/spark-2.0.2-bin-hadoop2.6/jars/spark-yarn_2.11-2.0.2.jar

Error in log:
Error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster

Error on terminal:
diagnostics: Application application_1489673977198_0002 failed 2 times due to AM Container for appattempt_1489673977198_0002_02 exited with exitCode: 1
For more detailed output, check application tracking page: http://bdg-hdp-sparkmaster:8088/proxy/application_1489673977198_0002/ Then, click on links to logs of each attempt.
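Two things in the command and configuration above look suspect, so here is a hedged sketch of a corrected submission (not a confirmed fix). First, `--jars` takes a single comma-separated value with no spaces; the space after the comma makes spark-submit treat the second jar as a separate argument. Second, `spark.yarn.jars` pointing at `yarn/*` would not include the spark-yarn jar that contains `org.apache.spark.deploy.yarn.ApplicationMaster`; in Spark 2.x the Spark jars live under `$SPARK_HOME/jars`. The application jar path below is a hypothetical placeholder, since the original command omits one.

```shell
# Sketch of a corrected submission (assumptions hedged above):
# - --jars: comma-separated, NO space after the comma
# - spark.yarn.jars: cover everything under $SPARK_HOME/jars, which is where
#   spark-yarn_2.11-2.0.2.jar (and hence ApplicationMaster) actually lives
# - /path/to/application.jar is a hypothetical placeholder for the app jar
spark-submit \
  --class spark.mongohadoop.testing3 \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.yarn.jars="local:/usr/local/spark-2.0.2-bin-hadoop2.6/jars/*" \
  --jars /home/ec2-user/jars/hgmongonew.jar,/home/ec2-user/jars/mongo-hadoop-spark-2.0.1.jar \
  /path/to/application.jar
```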