date:20161213

[GitHub] spark issue #15987: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION should sup...

2016-12-13 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/15987
  
Hi, @hvanhovell.
I will remove expression support from this PR. Is that okay with you?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #16265: [SPARK-18840][YARN] Avoid throw exception when ge...

2016-12-13 Thread mridulm

Github user mridulm commented on a diff in the pull request:

https://github.com/apache/spark/pull/16265#discussion_r92232169
  
--- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/security/HDFSCredentialProvider.scala ---
@@ -72,21 +72,22 @@ private[security] class HDFSCredentialProvider extends ServiceCredentialProvider
 // We cannot use the tokens generated with renewer yarn. Trying to renew
 // those will fail with an access control issue. So create new tokens with the logged in
 // user as renewer.
-sparkConf.get(PRINCIPAL).map { renewer =>
+sparkConf.get(PRINCIPAL).flatMap { renewer =>
   val creds = new Credentials()
   nnsToAccess(hadoopConf, sparkConf).foreach { dst =>
     val dstFs = dst.getFileSystem(hadoopConf)
     dstFs.addDelegationTokens(renewer, creds)
   }
-  val t = creds.getAllTokens.asScala
-    .filter(_.getKind == DelegationTokenIdentifier.HDFS_DELEGATION_KIND)
-    .head
-  val newExpiration = t.renew(hadoopConf)
-  val identifier = new DelegationTokenIdentifier()
-  identifier.readFields(new DataInputStream(new ByteArrayInputStream(t.getIdentifier)))
-  val interval = newExpiration - identifier.getIssueDate
-  logInfo(s"Renewal Interval is $interval")
-  interval
+  val hdfsToken = creds.getAllTokens.asScala
+    .find(_.getKind == DelegationTokenIdentifier.HDFS_DELEGATION_KIND)
+  hdfsToken.map { t =>
--- End diff --

My bad. s/find/filter/g in my head.

LGTM !



[GitHub] spark pull request #16252: [SPARK-18827][Core] Fix cannot read broadcast on ...

2016-12-13 Thread mridulm

Github user mridulm commented on a diff in the pull request:

https://github.com/apache/spark/pull/16252#discussion_r92231761
  
--- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -694,7 +694,7 @@ private[storage] class PartiallyUnrolledIterator[T](
   }
 
   override def next(): T = {
-    if (unrolled == null) {
+    if (unrolled == null || !unrolled.hasNext) {
--- End diff --

You are right, next() without hasNext is a valid code flow, and our code
should not break because a caller did not invoke hasNext (at best, it should
throw NoSuchElementException if hasNext == false).
Another option is to add a hasNext check here, but that would be worse,
since the normal flow would then check hasNext twice.

If we can't ensure "require(unrolled == null || unrolled.hasNext)", then
the current change is the best we can do, I guess.
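
To make the point above concrete, here is a minimal sketch of a two-part iterator whose next() stays correct even when the caller never invokes hasNext. The class name TwoPartIterator and its constructor are hypothetical stand-ins, not Spark's PartiallyUnrolledIterator; only the guard in next() mirrors the line under review.

```scala
// Hypothetical stand-in for the iterator in the diff; not Spark's actual class.
final class TwoPartIterator[T](first: Iterator[T], rest: Iterator[T]) extends Iterator[T] {
  // `unrolled` mirrors the field name in the diff; it becomes null once the
  // first part has been released.
  private var unrolled: Iterator[T] = first

  private def releaseFirst(): Unit = { unrolled = null }

  override def hasNext: Boolean = {
    if (unrolled == null) {
      rest.hasNext
    } else if (!unrolled.hasNext) {
      // The first part is exhausted: release it and continue with the rest.
      releaseFirst()
      rest.hasNext
    } else {
      true
    }
  }

  override def next(): T = {
    // Without the `!unrolled.hasNext` check, a caller that skips hasNext would
    // hit the exhausted first part instead of falling through to `rest`.
    if (unrolled == null || !unrolled.hasNext) {
      rest.next()
    } else {
      unrolled.next()
    }
  }
}
```

With this guard, repeated next() calls walk through both parts in order, and an exhausted iterator fails only when `rest` itself has no more elements.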




[GitHub] spark issue #14638: [SPARK-11374][SQL] Support `skip.header.line.count` opti...

2016-12-13 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14638
  
Thank you all so much for the reviews and discussion on this issue since Aug
14th.
Since we have discussed this fully both here and on the Spark mailing list, and
it has been 4 months already, I'll happily close this PR and SPARK-11374 as
WON'T FIX tomorrow (Dec. 14th).
Everyone's time is too precious to be stuck here any longer. Apache Spark
still has many things to do. Go Spark!



[GitHub] spark pull request #14079: [SPARK-8425][CORE] Application Level Blacklisting

2016-12-13 Thread kayousterhout

Github user kayousterhout commented on a diff in the pull request:

https://github.com/apache/spark/pull/14079#discussion_r92231466
  
--- Diff: core/src/test/scala/org/apache/spark/scheduler/SchedulerIntegrationSuite.scala ---
@@ -157,8 +160,16 @@ abstract class SchedulerIntegrationSuite[T <: MockBackend: ClassTag] extends Spa
   }
   // When a job fails, we terminate before waiting for all the task end events to come in,
   // so there might still be a running task set.  So we only check these conditions
-  // when the job succeeds
-  assert(taskScheduler.runningTaskSets.isEmpty)
+  // when the job succeeds.
+  // When the final task of a taskset completes, we post
+  // the event to the DAGScheduler event loop before we finish processing in the taskscheduler
+  // thread.  Its possible the DAGScheduler thread processes the event, finishes the job,
+  // and notifies the job waiter before our original thread in the task scheduler finishes
+  // handling the event and marks the taskset as complete.  So its ok if we need to wait a
+  // *little* bit longer for the original taskscheduler thread to finish up to deal w/ the race.
+  eventually(timeout(1 second), interval(100 millis)) {
+    assert(taskScheduler.runningTaskSets.isEmpty)
--- End diff --

Cool, sounds good re: separate fix.



[GitHub] spark issue #16246: [SPARK-18814][SQL] CheckAnalysis rejects TPCDS query 32

2016-12-13 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16246
  
Sure, thanks! @nsyca 



[GitHub] spark issue #16262: [SPARK-17932][SQL][FOLLOWUP] Change statement `SHOW TABL...

2016-12-13 Thread hvanhovell

Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/16262
  
@gatorsmile yeah you are right about that. @jiangxb1987 ignore my last 
request :)



[GitHub] spark pull request #16252: [SPARK-18827][Core] Fix cannot read broadcast on ...

2016-12-13 Thread srowen

Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/16252#discussion_r92229647
  
--- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -694,7 +694,7 @@ private[storage] class PartiallyUnrolledIterator[T](
   }
 
   override def next(): T = {
-    if (unrolled == null) {
+    if (unrolled == null || !unrolled.hasNext) {
--- End diff --

Yeah, that's a fair point: it should be fair to put the burden on the
caller to check `hasNext` before calling `next`, and now `TorrentBroadcast` does
that. However, are there other call sites that need that type of fix too? If
all callers are well behaved, then I agree we could revert the added `hasNext`
call in `next`.



[GitHub] spark pull request #14079: [SPARK-8425][CORE] Application Level Blacklisting

2016-12-13 Thread squito

Github user squito commented on a diff in the pull request:

https://github.com/apache/spark/pull/14079#discussion_r92229579
  
--- Diff: core/src/test/scala/org/apache/spark/scheduler/SchedulerIntegrationSuite.scala ---
@@ -157,8 +160,16 @@ abstract class SchedulerIntegrationSuite[T <: MockBackend: ClassTag] extends Spa
   }
   // When a job fails, we terminate before waiting for all the task end events to come in,
   // so there might still be a running task set.  So we only check these conditions
-  // when the job succeeds
-  assert(taskScheduler.runningTaskSets.isEmpty)
+  // when the job succeeds.
+  // When the final task of a taskset completes, we post
+  // the event to the DAGScheduler event loop before we finish processing in the taskscheduler
+  // thread.  Its possible the DAGScheduler thread processes the event, finishes the job,
+  // and notifies the job waiter before our original thread in the task scheduler finishes
+  // handling the event and marks the taskset as complete.  So its ok if we need to wait a
+  // *little* bit longer for the original taskscheduler thread to finish up to deal w/ the race.
+  eventually(timeout(1 second), interval(100 millis)) {
+    assert(taskScheduler.runningTaskSets.isEmpty)
--- End diff --

No, just that one of the test runs failed because of this:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68694/consoleFull

So I included the fix here in case there was more flakiness. But actually
I don't think I've seen this failure any other time, so I can move it out to a
separate fix.
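
The retry idea behind the `eventually(timeout(...), interval(...))` call in the diff can be sketched with plain JDK calls. The helper below is hypothetical and is not the ScalaTest API; the names Retry and retryUntil are assumptions made for illustration. It re-evaluates a condition until it holds or a deadline passes, which is the behaviour the test relies on to tolerate the race described in the code comment.

```scala
object Retry {
  // Hypothetical helper mimicking the idea of ScalaTest's `eventually`;
  // this is NOT the ScalaTest implementation.
  def retryUntil(timeoutMs: Long, intervalMs: Long)(check: => Boolean): Boolean = {
    val deadline = System.nanoTime() + timeoutMs * 1000000L
    var ok = check
    // Keep polling until the condition holds or the deadline has passed.
    while (!ok && System.nanoTime() < deadline) {
      Thread.sleep(intervalMs)
      ok = check
    }
    ok
  }
}
```

A caller would wrap the condition that may lag behind, for example `Retry.retryUntil(1000, 100)(runningTaskSetsIsEmpty())`, where `runningTaskSetsIsEmpty` is again a placeholder name.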



[GitHub] spark issue #16262: [SPARK-17932][SQL][FOLLOWUP] Change statement `SHOW TABL...

2016-12-13 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16262
  
@hvanhovell We did not merge the original PR 
https://github.com/apache/spark/pull/15958 to 2.1. No need to backport this to 
Spark 2.1, right? Thanks!



[GitHub] spark pull request #16262: [SPARK-17932][SQL][FOLLOWUP] Change statement `SH...

2016-12-13 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16262



[GitHub] spark issue #16246: [SPARK-18814][SQL] CheckAnalysis rejects TPCDS query 32

2016-12-13 Thread nsyca

Github user nsyca commented on the issue:

https://github.com/apache/spark/pull/16246
  
I am working on the code based on @hvanhovell's proposal.



[GitHub] spark issue #16262: [SPARK-17932][SQL][FOLLOWUP] Change statement `SHOW TABL...

2016-12-13 Thread hvanhovell

Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/16262
  
@jiangxb1987 could you open a backport for branch 2.1?



[GitHub] spark pull request #16260: [SPARK-18835][sql] Don't expose Guava types in th...

2016-12-13 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16260



[GitHub] spark issue #16262: [SPARK-17932][SQL][FOLLOWUP] Change statement `SHOW TABL...

2016-12-13 Thread hvanhovell

Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/16262
  
LGTM - merging to master/2.1. Thanks!



[GitHub] spark issue #16260: [SPARK-18835][sql] Don't expose Guava types in the JavaT...

2016-12-13 Thread vanzin

Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/16260
  
Merging to master / 2.1.



[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...

2016-12-13 Thread vanzin

Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/16253
  
> BTW, do you (or anybody else) know the purpose of 
JavaStreamingListenerAPISuite.java? 

I'd ask the person who wrote it ("git blame"). Looks like a compile-time 
check for the listener API.



[GitHub] spark issue #16176: [SPARK-18746][SQL] Add implicit encoder for BigDecimal, ...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16176
  
**[Test build #70090 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70090/consoleFull)**
 for PR 16176 at commit 
[`8f1a85d`](https://github.com/apache/spark/commit/8f1a85d3af302753aec103739391c1ef2c8d7ead).



[GitHub] spark issue #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory fail to...

2016-12-13 Thread merlintang

Github user merlintang commented on the issue:

https://github.com/apache/spark/pull/15819
  
Great, once #16134 is done, we can backport them together.

On Tue, Dec 13, 2016 at 12:18 AM, Wenchen Fan wrote:

> yea, I think we should backport a complete staging dir cleanup
> functionality to 1.6, let's wait for #16134




[GitHub] spark pull request #16230: [SPARK-13747][Core]Fix potential ThreadLocal leak...

2016-12-13 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16230



[GitHub] spark issue #16176: [SPARK-18746][SQL] Add implicit encoder for BigDecimal, ...

2016-12-13 Thread weiqingy

Github user weiqingy commented on the issue:

https://github.com/apache/spark/pull/16176
  
Yes, I have updated the PR. @cloud-fan 



[GitHub] spark issue #16230: [SPARK-13747][Core]Fix potential ThreadLocal leaks in RP...

2016-12-13 Thread yhuai

Github user yhuai commented on the issue:

https://github.com/apache/spark/pull/16230
  
Merging to master



[GitHub] spark issue #16230: [SPARK-13747][Core]Fix potential ThreadLocal leaks in RP...

2016-12-13 Thread yhuai

Github user yhuai commented on the issue:

https://github.com/apache/spark/pull/16230
  
LGTM



[GitHub] spark pull request #16104: [SPARK-18675][SQL] CTAS for hive serde table shou...

2016-12-13 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16104



[GitHub] spark pull request #16252: [SPARK-18827][Core] Fix cannot read broadcast on ...

2016-12-13 Thread mridulm

Github user mridulm commented on a diff in the pull request:

https://github.com/apache/spark/pull/16252#discussion_r92224929
  
--- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -694,7 +694,7 @@ private[storage] class PartiallyUnrolledIterator[T](
   }
 
   override def next(): T = {
-    if (unrolled == null) {
+    if (unrolled == null || !unrolled.hasNext) {
--- End diff --

Ideally, only the null check should be there, with the !hasNext case enforced
by setting unrolled = null when hasNext is false.
This is part of a tight loop, and it would be better if the footprint is kept
as small as possible.
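
One way to read this suggestion is to maintain the invariant `unrolled == null || unrolled.hasNext` at the points where `unrolled` can become exhausted, so that next() itself needs only the null check. The sketch below is an assumption about how that could look; NullingIterator is a hypothetical name and the class is not Spark's implementation.

```scala
// Hypothetical sketch: keeps the invariant `unrolled == null || unrolled.hasNext`.
final class NullingIterator[T](first: Iterator[T], rest: Iterator[T]) extends Iterator[T] {
  // Establish the invariant up front: an empty first part is dropped immediately.
  private var unrolled: Iterator[T] = if (first.hasNext) first else null

  override def hasNext: Boolean = unrolled != null || rest.hasNext

  override def next(): T = {
    if (unrolled == null) {
      // Once the first part is gone, this branch costs a single null check.
      rest.next()
    } else {
      val elem = unrolled.next()
      // Re-establish the invariant as soon as the first part runs dry.
      if (!unrolled.hasNext) unrolled = null
      elem
    }
  }
}
```

The trade-off in this sketch is that the hasNext call moves into the first-part branch; whether that is cheaper overall depends on how the iterator is consumed.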



[GitHub] spark issue #16104: [SPARK-18675][SQL] CTAS for hive serde table should work...

2016-12-13 Thread yhuai

Github user yhuai commented on the issue:

https://github.com/apache/spark/pull/16104
  
LGTM. Thanks. Merging to master



[GitHub] spark issue #16266: [SPARK-18842][TESTS][LAUNCHER] De-duplicate paths in cla...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16266
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70085/
Test FAILed.



[GitHub] spark issue #16266: [SPARK-18842][TESTS][LAUNCHER] De-duplicate paths in cla...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16266
  
Merged build finished. Test FAILed.



[GitHub] spark issue #16266: [SPARK-18842][TESTS][LAUNCHER] De-duplicate paths in cla...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16266
  
**[Test build #70085 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70085/consoleFull)**
 for PR 16266 at commit 
[`41752b8`](https://github.com/apache/spark/commit/41752b8c8b84552c01a78591e81cc89a25af6ec5).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16267: [SPARK-18841][SQL]fix PushProjectionThroughUnion throw e...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16267
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70087/
Test PASSed.



[GitHub] spark issue #16267: [SPARK-18841][SQL]fix PushProjectionThroughUnion throw e...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16267
  
Merged build finished. Test PASSed.



[GitHub] spark pull request #16265: [SPARK-18840][YARN] Avoid throw exception when ge...

2016-12-13 Thread mridulm

Github user mridulm commented on a diff in the pull request:

https://github.com/apache/spark/pull/16265#discussion_r92220391
  
--- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/security/HDFSCredentialProvider.scala ---
@@ -72,21 +72,22 @@ private[security] class HDFSCredentialProvider extends ServiceCredentialProvider
 // We cannot use the tokens generated with renewer yarn. Trying to renew
 // those will fail with an access control issue. So create new tokens with the logged in
 // user as renewer.
-sparkConf.get(PRINCIPAL).map { renewer =>
+sparkConf.get(PRINCIPAL).flatMap { renewer =>
   val creds = new Credentials()
   nnsToAccess(hadoopConf, sparkConf).foreach { dst =>
     val dstFs = dst.getFileSystem(hadoopConf)
     dstFs.addDelegationTokens(renewer, creds)
   }
-  val t = creds.getAllTokens.asScala
-    .filter(_.getKind == DelegationTokenIdentifier.HDFS_DELEGATION_KIND)
-    .head
-  val newExpiration = t.renew(hadoopConf)
-  val identifier = new DelegationTokenIdentifier()
-  identifier.readFields(new DataInputStream(new ByteArrayInputStream(t.getIdentifier)))
-  val interval = newExpiration - identifier.getIssueDate
-  logInfo(s"Renewal Interval is $interval")
-  interval
+  val hdfsToken = creds.getAllTokens.asScala
+    .find(_.getKind == DelegationTokenIdentifier.HDFS_DELEGATION_KIND)
+  hdfsToken.map { t =>
--- End diff --

Is this guaranteed to have only a single value? If not, it could be
changing the cardinality of what is returned compared to before.
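
The cardinality question can be checked on a small model. In the sketch below, Token, HdfsKind, and both function names are hypothetical, and the kind string is only a placeholder for DelegationTokenIdentifier.HDFS_DELEGATION_KIND. It contrasts the old `filter(...).head` shape with the new `find(...)` shape: both pick the first matching token, so at most one value comes back either way, and they differ only when no token matches.

```scala
object TokenLookup {
  // Hypothetical stand-in for a Hadoop token: only a kind and a renewal interval.
  final case class Token(kind: String, intervalMs: Long)

  // Placeholder value, assumed for illustration only.
  val HdfsKind = "HDFS_DELEGATION_TOKEN"

  // Old shape: `.filter(...).head` throws NoSuchElementException when nothing matches.
  def intervalOld(principal: Option[String], tokens: Seq[Token]): Option[Long] =
    principal.map { _ => tokens.filter(_.kind == HdfsKind).head.intervalMs }

  // New shape: `.find(...)` yields an Option, and flatMap collapses the nesting.
  def intervalNew(principal: Option[String], tokens: Seq[Token]): Option[Long] =
    principal.flatMap { _ => tokens.find(_.kind == HdfsKind).map(_.intervalMs) }
}
```

When several tokens of the same kind exist, both versions use the first one; only the empty case changes, from an exception to None.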



[GitHub] spark issue #16267: [SPARK-18841][SQL]fix PushProjectionThroughUnion throw e...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16267
  
**[Test build #70087 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70087/consoleFull)**
 for PR 16267 at commit 
[`ab6d8d4`](https://github.com/apache/spark/commit/ab6d8d45a95adef41cd8f5bd1e520c999067705e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #15279: SPARK-12347 [ML][WIP] Add a script to test Spark ML exam...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15279
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70088/
Test FAILed.



[GitHub] spark issue #15279: SPARK-12347 [ML][WIP] Add a script to test Spark ML exam...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15279
  
Merged build finished. Test FAILed.



[GitHub] spark issue #15279: SPARK-12347 [ML][WIP] Add a script to test Spark ML exam...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15279
  
**[Test build #70088 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70088/consoleFull)**
 for PR 15279 at commit 
[`7b9fe15`](https://github.com/apache/spark/commit/7b9fe15513c4ee3b4fac565cdf54973d78cdd4db).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #15717: [SPARK-17910][SQL] Allow users to update the comment of ...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15717
  
**[Test build #70089 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70089/consoleFull)**
 for PR 15717 at commit 
[`57934d4`](https://github.com/apache/spark/commit/57934d4bba2887a2cacff45e6303e96fa9759e1d).



[GitHub] spark issue #16246: [SPARK-18814][SQL] CheckAnalysis rejects TPCDS query 32

2016-12-13 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16246
  
@hvanhovell I like your ideas. : ) 

This JIRA is at the `Blocker` level for Spark 2.1. If we do a major 
refactoring of `CheckAnalysis`, is it too risky at the last minute? I am fine 
if you think the above proposal has a limited impact. Thanks!



[GitHub] spark pull request #16256: [SPARK-18816] [Web UI] Executors Logs column only...

2016-12-13 Thread ajbozarth

Github user ajbozarth commented on a diff in the pull request:

https://github.com/apache/spark/pull/16256#discussion_r92199279
  
--- Diff: 
core/src/main/resources/org/apache/spark/ui/static/executorspage.js ---
@@ -412,18 +412,15 @@ $(document).ready(function () {
 ],
 "columnDefs": [
 {
-"targets": [ 15 ],
-"visible": logsExist(response)
-},
-{
 "targets": [ 16 ],
 "visible": getThreadDumpEnabled()
 }
 ],
 "order": [[0, "asc"]]
 };
 
-$(selector).DataTable(conf);
+var dt = $(selector).DataTable(conf);
--- End diff --

No, the logic for column 16 is based on a conf flag; column 15 is based on a function.



[GitHub] spark pull request #16255: [SPARK-18609][SQL]Fix when CTE with Join between ...

2016-12-13 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/16255#discussion_r92197200
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -200,6 +200,8 @@ object RemoveAliasOnlyProject extends Rule[LogicalPlan] {
 case plan: Project if plan eq proj => plan.child
 case plan => plan transformExpressions {
   case a: Attribute if attrMap.contains(a) => attrMap(a)
+  case b: Alias if attrMap.exists(_._1.exprId == b.exprId)
--- End diff --

It looks like semanticEquals might be broken (this should be, but perhaps 
this is for a reason). @cloud-fan any idea?

Please use a proper pattern match here.



[GitHub] spark issue #15279: SPARK-12347 [ML][WIP] Add a script to test Spark ML exam...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15279
  
**[Test build #70088 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70088/consoleFull)**
 for PR 15279 at commit 
[`7b9fe15`](https://github.com/apache/spark/commit/7b9fe15513c4ee3b4fac565cdf54973d78cdd4db).



[GitHub] spark issue #16267: [SPARK-18841][SQL]fix PushProjectionThroughUnion throw e...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16267
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70086/
Test FAILed.



[GitHub] spark issue #16267: [SPARK-18841][SQL]fix PushProjectionThroughUnion throw e...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16267
  
Merged build finished. Test FAILed.



[GitHub] spark issue #16267: [SPARK-18841][SQL]fix PushProjectionThroughUnion throw e...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16267
  
**[Test build #70086 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70086/consoleFull)**
 for PR 16267 at commit 
[`b7b27af`](https://github.com/apache/spark/commit/b7b27af579b7f3f1df31834575b9b3a994c2d806).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request #16255: [SPARK-18609][SQL]Fix when CTE with Join between ...

2016-12-13 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/16255#discussion_r92186722
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -416,8 +418,8 @@ object ColumnPruning extends Rule[LogicalPlan] {
 case w: Window if w.windowExpressions.isEmpty => w.child
 
 // Eliminate no-op Projects
-case p @ Project(_, child) if sameOutput(child.output, p.output) => child
-
+case p @ Project(_, child) if sameOutput(child.output, p.output) =>
+  if (child.isInstanceOf[CatalogRelation]) p else child
--- End diff --

Why is a CatalogRelation different?



[GitHub] spark issue #16267: [SPARK-18841][SQL]fix PushProjectionThroughUnion throw e...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16267
  
**[Test build #70087 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70087/consoleFull)**
 for PR 16267 at commit 
[`ab6d8d4`](https://github.com/apache/spark/commit/ab6d8d45a95adef41cd8f5bd1e520c999067705e).



[GitHub] spark issue #16255: [SPARK-18609][SQL]Fix when CTE with Join between two tab...

2016-12-13 Thread windpiger

Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/16255
  
@hvanhovell please help to review this, thanks a lot!



[GitHub] spark issue #16252: [SPARK-18827][Core] Fix cannot read broadcast on disk

2016-12-13 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/16252
  
@wangyum Thanks! LGTM



[GitHub] spark issue #16267: [SPARK-18841][SQL]fix PushProjectionThroughUnion throw e...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16267
  
**[Test build #70086 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70086/consoleFull)**
 for PR 16267 at commit 
[`b7b27af`](https://github.com/apache/spark/commit/b7b27af579b7f3f1df31834575b9b3a994c2d806).



[GitHub] spark pull request #16267: [SPARK-18841][SQL]fix PushProjectionThroughUnion ...

2016-12-13 Thread windpiger

GitHub user windpiger opened a pull request:

https://github.com/apache/spark/pull/16267

[SPARK-18841][SQL]fix PushProjectionThroughUnion throw exception when there 
are same column name 

## What changes were proposed in this pull request?
For a union SQL query with the same column name, applying the rules 
RemoveAliasOnlyProject and PushProjectionThroughUnion repeatedly throws an 
exception.

The reason is that the RemoveAliasOnlyProject rule removes the left Project 
child (an alias-only Project) of the Union and replaces the attributes of the 
Union and of its right Project child. The PushProjectionThroughUnion rule is 
then applied. Because the output attributes of a Union are always equal to the 
left child's output, it throws an exception saying that the left child does 
not contain an attribute of the Union.

for example:
```
spark.sql("DROP TABLE IF EXISTS p1")
spark.sql("DROP TABLE IF EXISTS p2")
spark.sql("DROP TABLE IF EXISTS p3")

spark.sql("CREATE TABLE p1 (col int)")
spark.sql("CREATE TABLE p2 (col int)")
spark.sql("CREATE TABLE p3 (col int)")
spark.sql("set spark.sql.crossJoin.enabled = true")
spark.sql("SELECT 1 as cste,col FROM (SELECT col as col FROM (SELECT p1.col as col FROM p1 LEFT JOIN p2 UNION ALL SELECT col FROM p3 ) T1) T2").show
```

exception:
```
key not found: col#16
java.util.NoSuchElementException: key not found: col#16
    at scala.collection.MapLike$class.default(MapLike.scala:228)
    at org.apache.spark.sql.catalyst.expressions.AttributeMap.default(AttributeMap.scala:31)
    at scala.collection.MapLike$class.apply(MapLike.scala:141)
    at org.apache.spark.sql.catalyst.expressions.AttributeMap.apply(AttributeMap.scala:31)
    at org.apache.spark.sql.catalyst.optimizer.PushProjectionThroughUnion$$anonfun$2.applyOrElse(Optimizer.scala:346)
    at org.apache.spark.sql.catalyst.optimizer.PushProjectionThroughUnion$$anonfun$2.applyOrElse(Optimizer.scala:345)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:292)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:292)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:74)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:291)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:281)
    at org.apache.spark.sql.catalyst.optimizer.PushProjectionThroughUnion$.org$apache$spark$sql$catalyst$optimizer$PushProjectionThroughUnion$$pushToRight(Optimizer.scala:345)
    at org.apache.spark.sql.catalyst.optimizer.PushProjectionThroughUnion$$anonfun$apply$4$$anonfun$8$$anonfun$apply$31.apply(Optimizer.scala:378)
    at org.apache.spark.sql.catalyst.optimizer.PushProjectionThroughUnion$$anonfun$apply$4$$anonfun$8$$anonfun$apply$31.apply(Optimizer.scala:378)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.immutable.List.map(List.scala:285)
    at org.apache.spark.sql.catalyst.optimizer.PushProjectionThroughUnion$$anonfun$apply$4$$anonfun$8.apply(Optimizer.scala:378)
```
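
The failure mode in the trace can be modeled without Catalyst. The sketch below is a simplified stand-in, not the optimizer code: `Attr`, `buildRewrites`, `rewriteStrict`, and `rewriteLenient` are hypothetical names, and a plain `Map` plays the role of `AttributeMap`. It assumes the rewrite map is keyed by the left child's output, so looking up an attribute that the left child no longer produces fails exactly like `key not found: col#16`.

```scala
// Simplified model: an attribute is identified by name and expression id (e.g. col#16).
final case class Attr(name: String, exprId: Long)

object PushToRightSketch {
  // Map each left-child output attribute to the right-child attribute at the same position.
  def buildRewrites(left: Seq[Attr], right: Seq[Attr]): Map[Attr, Attr] =
    left.zip(right).toMap

  // Strict lookup: throws NoSuchElementException when the attribute is not a left output.
  def rewriteStrict(a: Attr, rewrites: Map[Attr, Attr]): Attr =
    rewrites(a)

  // Lenient lookup: leaves unknown attributes untouched instead of throwing.
  def rewriteLenient(a: Attr, rewrites: Map[Attr, Attr]): Attr =
    rewrites.getOrElse(a, a)
}
```

The lenient variant only shows where the lookup breaks; it is not meant as the fix proposed in this pull request.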

## How was this patch tested?
unit test added

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/windpiger/spark 
FixPushDownUnionProjWithRemoveOnlyAliasProj

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16267.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16267


commit b7b27af579b7f3f1df31834575b9b3a994c2d806
Author: windpiger 
Date:   2016-12-13T14:34:22Z

[SPARK-18841][SQL]fix PushProjectionThroughUnion throw exception when there 
are same column name





[GitHub] spark issue #16266: [SPARK-18842][TESTS][LAUNCHER] De-duplicate paths in cla...

2016-12-13 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/16266
  
Build started: [TESTS] `org.apache.spark.ShuffleSuite` 
[![PR-16266](https://ci.appveyor.com/api/projects/status/github/spark-test/spark?branch=E05665C1-B7BA-4505-9821-0E82EA9AB928&svg=true)](https://ci.appveyor.com/project/spark-test/spark/branch/E05665C1-B7BA-4505-9821-0E82EA9AB928)
Build started: [TESTS] 
`org.apache.spark.sql.execution.joins.BroadcastJoinSuite` 
[![PR-16266](https://ci.appveyor.com/api/projects/status/github/spark-test/spark?branch=1B112E86-0188-4603-BE28-3B184789C802&svg=true)](https://ci.appveyor.com/project/spark-test/spark/branch/1B112E86-0188-4603-BE28-3B184789C802)
Diff: 
https://github.com/apache/spark/compare/master...spark-test:E05665C1-B7BA-4505-9821-0E82EA9AB928



[GitHub] spark pull request #16266: [SPARK-18842][TESTS][LAUNCHER] De-duplicate paths...

2016-12-13 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/16266#discussion_r92182837
  
--- Diff: project/SparkBuild.scala ---
@@ -824,7 +824,8 @@ object TestSettings {
 // launched by the tests have access to the correct test-time classpath.
 envVars in Test ++= Map(
   "SPARK_DIST_CLASSPATH" ->
-(fullClasspath in Test).value.files.map(_.getAbsolutePath).mkString(":").stripSuffix(":"),
+(fullClasspath in Test).value.files.map(_.getAbsolutePath)
+  .mkString(File.pathSeparator).stripSuffix(File.pathSeparator),
--- End diff --

This change is required because `addToClassPath` splits the classpath on 
`File.pathSeparator`. If the entries are joined with `:` instead, the string is 
not split on Windows and we end up with long duplicated paths.
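
As a rough illustration of why the join and the split must agree, here is a minimal sketch. `ClasspathJoin`, `join`, and `split` are hypothetical helpers, not the launcher's `addToClassPath`; the only real API used is `java.io.File.pathSeparator`, which is `;` on Windows and `:` on Unix-like systems.

```scala
import java.io.File
import java.util.regex.Pattern

object ClasspathJoin {
  // Join entries with the platform separator so a later split recovers them.
  def join(paths: Seq[String]): String =
    paths.mkString(File.pathSeparator)

  // Split on the platform separator, quoted because String.split takes a regex.
  def split(classpath: String): Seq[String] =
    classpath.split(Pattern.quote(File.pathSeparator)).toSeq.filter(_.nonEmpty)
}
```

If the entries were joined with a hard-coded `:` on Windows, `split` would return the whole string as a single entry, which is the situation described above.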



[GitHub] spark issue #16266: [SPARK-18842][TESTS][LAUNCHER] De-duplicate paths in cla...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16266
  
**[Test build #70085 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70085/consoleFull)**
 for PR 16266 at commit 
[`41752b8`](https://github.com/apache/spark/commit/41752b8c8b84552c01a78591e81cc89a25af6ec5).



[GitHub] spark issue #16266: [SPARK-18842][TESTS][LAUNCHER] De-duplicate paths in cla...

2016-12-13 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/16266
  
It seems I underestimated the number of suites using `local-cluster` that fail 
due to the path length limitation; there are more than I expected. For example, 
`LocalClusterSparkContext` is extended by some suites, so I missed them when 
searching via `grep`.

Thank you for approving. I will try to double-check.




[GitHub] spark pull request #16266: [SPARK-18842][TESTS][LAUNCHER] De-duplicate paths...

2016-12-13 Thread HyukjinKwon

GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/16266

[SPARK-18842][TESTS][LAUNCHER] De-duplicate paths in classpaths in 
processes for local-cluster mode to work around the path length limitation on 
Windows

## What changes were proposed in this pull request?

Currently, some tests fail and hang on Windows due to this problem. For the 
reason described in SPARK-18718, some tests using `local-cluster` mode were 
disabled on Windows due to the length limitation on the paths given to 
classpaths.

The limit appears to be roughly 32K characters (see the [blog in 
MS](https://blogs.msdn.microsoft.com/oldnewthing/20031210-00/?p=41553/) and 
[another 
reference](https://support.thoughtworks.com/hc/en-us/articles/213248526-Getting-around-maximum-command-line-length-is-32767-characters-on-Windows)),
 but in `local-cluster` mode, executors were being launched as processes with 
a command such as 
[here](https://gist.github.com/HyukjinKwon/5bc81061c250d4af5a180869b59d42ea), in 
tests only.

This command is roughly 40K long due to the classpaths given to the `java` 
command. However, more than half of the entries appear to be duplicates. If we 
de-duplicate the paths, the command shrinks to roughly 20K, as shown 
[here](https://gist.github.com/HyukjinKwon/dad0c8db897e5e094684a2dc6a417790).

We may need to revisit this as more paths are added in the future, but for now 
it seems better than disabling all the tests, and the change is minimal.

Therefore, this PR proposes to de-duplicate the paths in classpaths when 
launching executors as processes in `local-cluster` mode.
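
The de-duplication idea can be sketched as follows. This is a minimal illustration under assumptions, not the change made in this PR: `ClasspathDedup` and `dedup` are hypothetical names, and the sketch simply keeps the first occurrence of each entry so that classpath precedence is preserved.

```scala
import java.io.File
import java.util.regex.Pattern

object ClasspathDedup {
  // Drop repeated entries while keeping first-occurrence order.
  def dedup(classpath: String): String =
    classpath
      .split(Pattern.quote(File.pathSeparator))
      .filter(_.nonEmpty)
      .distinct
      .mkString(File.pathSeparator)
}
```

Keeping the first occurrence matters because the JVM resolves classes from the earliest matching entry, so removing later duplicates should not change which class is loaded.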


## How was this patch tested?

Existing tests in `ShuffleSuite` and `BroadcastJoinSuite` were run manually via 
AppVeyor.




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HyukjinKwon/spark disable-local-cluster-tests

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16266.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16266


commit 41752b8c8b84552c01a78591e81cc89a25af6ec5
Author: hyukjinkwon 
Date:   2016-12-13T13:35:42Z

Deduplicate paths in classpath to workaround length limitation on Windows





[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15915
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70083/
Test PASSed.



[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15915
  
Merged build finished. Test PASSed.



[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15915
  
**[Test build #70083 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70083/consoleFull)**
 for PR 15915 at commit 
[`b66b277`](https://github.com/apache/spark/commit/b66b27704000e742beaddcb9d701a2af265ba6d7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16252: [SPARK-18827][Core] Fix cannot read broadcast on disk

2016-12-13 Thread wangyum

Github user wangyum commented on the issue:

https://github.com/apache/spark/pull/16252
  
@srowen @viirya I have added it.



[GitHub] spark issue #16252: [SPARK-18827][Core] Fix cannot read broadcast on disk

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16252
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70080/
Test PASSed.



[GitHub] spark issue #16252: [SPARK-18827][Core] Fix cannot read broadcast on disk

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16252
  
Merged build finished. Test PASSed.



[GitHub] spark issue #16252: [SPARK-18827][Core] Fix cannot read broadcast on disk

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16252
  
**[Test build #70080 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70080/consoleFull)**
 for PR 16252 at commit 
[`f004740`](https://github.com/apache/spark/commit/f00474032dab2561e85d3c2fd7aad01d0dcacc8e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16253
  
Merged build finished. Test PASSed.



[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16253
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70079/
Test PASSed.



[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16253
  
**[Test build #70079 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70079/consoleFull)** for PR 16253 at commit [`7891017`](https://github.com/apache/spark/commit/78910172b44d39929dd824508355c7e20c28dd9f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request #15915: [SPARK-18485][CORE] Underlying integer overflow w...

2016-12-13 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/15915#discussion_r92169149
  
--- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -331,7 +331,12 @@ private[spark] class MemoryStore(
 var unrollMemoryUsedByThisBlock = 0L
 // Underlying buffer for unrolling the block
 val redirectableStream = new RedirectableOutputStream
-val bbos = new ChunkedByteBufferOutputStream(initialMemoryThreshold.toInt, allocator)
+val chunkSize = if (initialMemoryThreshold > Int.MaxValue) {
+  4 * 1024 * 1024
--- End diff --

Because `initialMemoryThreshold` is set to `unrollMemoryThreshold`, which is 
configurable, we should at least log a warning if we fall back to a default 
value in this case.

Another option is to just throw an exception. I am OK with either one.
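
The two options raised in this comment can be sketched as follows. This is a minimal illustration, not the actual patch: the object name `ChunkSizeSketch`, the helpers `chooseChunkSize` and `chunkSizeOrThrow`, and the `warn` callback (a stand-in for a logger) are all hypothetical. Only the 4 MB fallback value comes from the diff above.

```scala
object ChunkSizeSketch {
  // Fallback chunk size taken from the diff under review (4 MB).
  val DefaultChunkSize: Int = 4 * 1024 * 1024

  // Option 1: fall back to a default, but warn the user about it.
  // `warn` is a hypothetical stand-in for a logging call.
  def chooseChunkSize(initialMemoryThreshold: Long, warn: String => Unit): Int =
    if (initialMemoryThreshold > Int.MaxValue) {
      warn(s"Initial memory threshold of $initialMemoryThreshold bytes exceeds " +
        s"Int.MaxValue; using a chunk size of $DefaultChunkSize bytes instead.")
      DefaultChunkSize
    } else {
      // Safe: the value is known to fit in an Int on this branch.
      initialMemoryThreshold.toInt
    }

  // Option 2: reject the configuration outright.
  // `require` throws IllegalArgumentException when the condition is false.
  def chunkSizeOrThrow(initialMemoryThreshold: Long): Int = {
    require(initialMemoryThreshold <= Int.MaxValue,
      s"Initial memory threshold $initialMemoryThreshold exceeds Int.MaxValue")
    initialMemoryThreshold.toInt
  }
}
```

Either variant avoids the silent truncation that `initialMemoryThreshold.toInt` would perform on a value larger than `Int.MaxValue`.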



[GitHub] spark issue #16263: [SPARK-18281][SQL][PySpark] Consumes the returned local ...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16263
  
Merged build finished. Test PASSed.



[GitHub] spark issue #16263: [SPARK-18281][SQL][PySpark] Consumes the returned local ...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16263
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70084/
Test PASSed.



[GitHub] spark issue #16263: [SPARK-18281][SQL][PySpark] Consumes the returned local ...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16263
  
**[Test build #70084 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70084/consoleFull)** for PR 16263 at commit [`13e16fd`](https://github.com/apache/spark/commit/13e16fd11c5bed047dbc330e36f2f1a430ad2a2d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16252: [SPARK-18827][Core] Fix cannot read broadcast on disk

2016-12-13 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/16252
  
+1
> @wangyum how about adding @viirya 's suggested change to 
TorrentBroadcast? it would be even more robust.




[GitHub] spark issue #16263: [SPARK-18281][SQL][PySpark] Consumes the returned local ...

2016-12-13 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/16263
  
Dealing with the differences between Python 2 and Python 3...



[GitHub] spark issue #16263: [SPARK-18281][SQL][PySpark] Consumes the returned local ...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16263
  
**[Test build #70084 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70084/consoleFull)** for PR 16263 at commit [`13e16fd`](https://github.com/apache/spark/commit/13e16fd11c5bed047dbc330e36f2f1a430ad2a2d).



[GitHub] spark issue #16259: [Minor][SparkR]:fix kstest example error and add unit te...

2016-12-13 Thread yanboliang

Github user yanboliang commented on the issue:

https://github.com/apache/spark/pull/16259
  
LGTM



[GitHub] spark issue #16037: [SPARK-18471][MLLIB] In LBFGS, avoid sending huge vector...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16037
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70081/
Test PASSed.



[GitHub] spark issue #16037: [SPARK-18471][MLLIB] In LBFGS, avoid sending huge vector...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16037
  
Merged build finished. Test PASSed.



[GitHub] spark issue #16037: [SPARK-18471][MLLIB] In LBFGS, avoid sending huge vector...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16037
  
**[Test build #70081 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70081/consoleFull)** for PR 16037 at commit [`18fcbba`](https://github.com/apache/spark/commit/18fcbba81168c741497c57ede2554ed0c8e48d2c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16263: [SPARK-18281][SQL][PySpark] Consumes the returned local ...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16263
  
**[Test build #70082 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70082/consoleFull)** for PR 16263 at commit [`07635df`](https://github.com/apache/spark/commit/07635dfce3b5a14001e74495199530ac9c16386b).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16263: [SPARK-18281][SQL][PySpark] Consumes the returned local ...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16263
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70082/
Test FAILed.



[GitHub] spark issue #16263: [SPARK-18281][SQL][PySpark] Consumes the returned local ...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16263
  
Merged build finished. Test FAILed.



[GitHub] spark issue #16145: [MINOR][CORE][SQL] Remove explicit RDD and Partition ove...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16145
  
Merged build finished. Test PASSed.



[GitHub] spark issue #16145: [MINOR][CORE][SQL] Remove explicit RDD and Partition ove...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16145
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70078/
Test PASSed.



[GitHub] spark issue #16145: [MINOR][CORE][SQL] Remove explicit RDD and Partition ove...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16145
  
**[Test build #70078 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70078/consoleFull)** for PR 16145 at commit [`45f7f67`](https://github.com/apache/spark/commit/45f7f67a74421ffdb653e7fe4a30151b692892a2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16263: [SPARK-18281][SQL][PySpark] Consumes the returned local ...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16263
  
**[Test build #70082 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70082/consoleFull)** for PR 16263 at commit [`07635df`](https://github.com/apache/spark/commit/07635dfce3b5a14001e74495199530ac9c16386b).



[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15915
  
**[Test build #70083 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70083/consoleFull)** for PR 15915 at commit [`b66b277`](https://github.com/apache/spark/commit/b66b27704000e742beaddcb9d701a2af265ba6d7).



[GitHub] spark issue #16037: [SPARK-18471][MLLIB] In LBFGS, avoid sending huge vector...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16037
  
**[Test build #70081 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70081/consoleFull)** for PR 16037 at commit [`18fcbba`](https://github.com/apache/spark/commit/18fcbba81168c741497c57ede2554ed0c8e48d2c).



[GitHub] spark pull request #16037: [SPARK-18471][MLLIB] In LBFGS, avoid sending huge...

2016-12-13 Thread AnthonyTruchet

Github user AnthonyTruchet commented on a diff in the pull request:

https://github.com/apache/spark/pull/16037#discussion_r92150485
  
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/optimization/LBFGS.scala ---
@@ -241,16 +241,24 @@ object LBFGS extends Logging {
   val bcW = data.context.broadcast(w)
   val localGradient = gradient
 
-  val (gradientSum, lossSum) = data.treeAggregate((Vectors.zeros(n), 0.0))(
-  seqOp = (c, v) => (c, v) match { case ((grad, loss), (label, features)) =>
-val l = localGradient.compute(
-  features, label, bcW.value, grad)
-(grad, loss + l)
-  },
-  combOp = (c1, c2) => (c1, c2) match { case ((grad1, loss1), (grad2, loss2)) =>
-axpy(1.0, grad2, grad1)
-(grad1, loss1 + loss2)
-  })
+  val seqOp = (c: (Vector, Double), v: (Double, Vector)) =>
+(c, v) match {
+  case ((grad, loss), (label, features)) =>
+val denseGrad = grad.toDense
+val l = localGradient.compute(features, label, bcW.value, denseGrad)
+(denseGrad, loss + l)
+}
+
+  val combOp = (c1: (Vector, Double), c2: (Vector, Double)) =>
+(c1, c2) match { case ((grad1, loss1), (grad2, loss2)) =>
+  val denseGrad1 = grad1.toDense
--- End diff --

I did, thanks, and I merged it. I was just coming back to this contribution 
when I saw it.



[GitHub] spark issue #16252: [SPARK-18827][Core] Fix cannot read broadcast on disk

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16252
  
**[Test build #70080 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70080/consoleFull)** for PR 16252 at commit [`f004740`](https://github.com/apache/spark/commit/f00474032dab2561e85d3c2fd7aad01d0dcacc8e).



[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...

2016-12-13 Thread saturday-shi

Github user saturday-shi commented on the issue:

https://github.com/apache/spark/pull/16253
  
@vanzin 
Thank you for your suggestions. Actually, `ApiStreamingRootResource` would 
be a better name.

BTW, do you (or anybody else) know the purpose of 
[JavaStreamingListenerAPISuite.java](https://github.com/apache/spark/blob/master/streaming/src/test/java/org/apache/spark/streaming/JavaStreamingListenerAPISuite.java)? 
When I was adding some tests, I found that it never seems to be used by any 
code, so I just left it alone. Should we delete it, or update it anyway?



[GitHub] spark issue #16253: [SPARK-18537][Web UI] Add a REST api to serve spark stre...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16253
  
**[Test build #70079 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70079/consoleFull)** for PR 16253 at commit [`7891017`](https://github.com/apache/spark/commit/78910172b44d39929dd824508355c7e20c28dd9f).



[GitHub] spark issue #16149: [SPARK-18715][ML]Fix AIC calculations in Binomial GLM

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16149
  
**[Test build #3495 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3495/consoleFull)** for PR 16149 at commit [`6e6c48b`](https://github.com/apache/spark/commit/6e6c48b79065666e1e896eec76e1ffa8cb751b6e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTableAsSelec...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15996
  
Merged build finished. Test PASSed.



[GitHub] spark issue #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTableAsSelec...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15996
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70074/
Test PASSed.



[GitHub] spark issue #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTableAsSelec...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15996
  
**[Test build #70074 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70074/consoleFull)** for PR 15996 at commit [`172f6eb`](https://github.com/apache/spark/commit/172f6eb5eeb36819aaf731c547540c5af90c49cc).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16257: [SPARK-18752][sql] Follow-up: add scaladoc explaining is...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16257
  
**[Test build #3494 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3494/consoleFull)** for PR 16257 at commit [`b8cdd4f`](https://github.com/apache/spark/commit/b8cdd4f7fff7f1c7d346516a80ef70cd103a90bd).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request #13909: [SPARK-16213][SQL] Reduce runtime overhead of a p...

2016-12-13 Thread kiszk

Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/13909#discussion_r92142517
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala ---
@@ -56,33 +58,93 @@ case class CreateArray(children: Seq[Expression]) extends Expression {
   }
 
   override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
-val arrayClass = classOf[GenericArrayData].getName
-val values = ctx.freshName("values")
-ctx.addMutableState("Object[]", values, s"this.$values = null;")
+val array = ctx.freshName("array")
 
-ev.copy(code = s"""
-  this.$values = new Object[${children.size}];""" +
+val et = dataType.elementType
+val evals = children.map(e => e.genCode(ctx))
+val isPrimitiveArray = ctx.isPrimitiveType(et)
+val primitiveTypeName = if (isPrimitiveArray) ctx.primitiveTypeName(et) else ""
+val (preprocess, arrayData, arrayWriter) =
+  GenArrayData.getCodeArrayData(ctx, et, children.size, isPrimitiveArray, array)
+
+ev.copy(code =
+  preprocess +
   ctx.splitExpressions(
 ctx.INPUT_ROW,
-children.zipWithIndex.map { case (e, i) =>
-  val eval = e.genCode(ctx)
-  eval.code + s"""
-if (${eval.isNull}) {
-  $values[$i] = null;
+evals.zipWithIndex.map { case (eval, i) =>
+  eval.code +
+(if (isPrimitiveArray) {
+  (if (!children(i).nullable) {
+s"\n$arrayWriter.write($i, ${eval.value});"
+  } else {
+s"""
+if (${eval.isNull}) {
--- End diff --

What do you mean by "manually optimize"? I generate the code naively, without 
optimization. I could optimize it by skipping the generation of 
`$arrayWriter.setNull$primitiveTypeName($i);` when `${eval.isNull} == "false"`, 
but I did not apply that here.
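
The optimization mentioned here (but not applied in the patch) could look roughly like the sketch below. It is only an illustration under stated assumptions: `NullCheckCodegenSketch` and `writeElement` are hypothetical names, and the `isNull` and `value` parameters are plain strings that mimic how the diff uses `eval.isNull` and `eval.value`. It does not use any real Spark codegen API.

```scala
object NullCheckCodegenSketch {
  // Returns the Java snippet that stores element `i` into the array writer.
  // When `isNull` is the literal string "false", the element can never be
  // null, so the setNull branch is not generated at all.
  def writeElement(
      arrayWriter: String,
      primitiveTypeName: String,
      i: Int,
      isNull: String,
      value: String): String =
    if (isNull == "false") {
      s"$arrayWriter.write($i, $value);"
    } else {
      s"""if ($isNull) {
         |  $arrayWriter.setNull$primitiveTypeName($i);
         |} else {
         |  $arrayWriter.write($i, $value);
         |}""".stripMargin
    }
}
```

For example, `writeElement("writer", "Int", 0, "false", "value0")` yields only the single `write` call, while any other `isNull` expression yields the full null check.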



[GitHub] spark pull request #13909: [SPARK-16213][SQL] Reduce runtime overhead of a p...

2016-12-13 Thread kiszk

Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/13909#discussion_r92141735
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
 ---
@@ -56,33 +58,93 @@ case class CreateArray(children: Seq[Expression]) extends Expression {
   }
 
   override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
-val arrayClass = classOf[GenericArrayData].getName
-val values = ctx.freshName("values")
-ctx.addMutableState("Object[]", values, s"this.$values = null;")
+val array = ctx.freshName("array")
 
-ev.copy(code = s"""
-  this.$values = new Object[${children.size}];""" +
+val et = dataType.elementType
+val evals = children.map(e => e.genCode(ctx))
+val isPrimitiveArray = ctx.isPrimitiveType(et)
+val primitiveTypeName = if (isPrimitiveArray) ctx.primitiveTypeName(et) else ""
+val (preprocess, arrayData, arrayWriter) =
+  GenArrayData.getCodeArrayData(ctx, et, children.size, isPrimitiveArray, array)
+
+ev.copy(code =
+  preprocess +
   ctx.splitExpressions(
 ctx.INPUT_ROW,
-children.zipWithIndex.map { case (e, i) =>
-  val eval = e.genCode(ctx)
-  eval.code + s"""
-if (${eval.isNull}) {
-  $values[$i] = null;
+evals.zipWithIndex.map { case (eval, i) =>
+  eval.code +
+(if (isPrimitiveArray) {
+  (if (!children(i).nullable) {
+s"\n$arrayWriter.write($i, ${eval.value});"
+  } else {
+s"""
+if (${eval.isNull}) {
+  $arrayWriter.setNull$primitiveTypeName($i);
+} else {
+  $arrayWriter.write($i, ${eval.value});
+}
+   """
+  })
 } else {
-  $values[$i] = ${eval.value};
-}
-   """
+  s"""
+  if (${eval.isNull}) {
+$array[$i] = null;
+  } else {
+$array[$i] = ${eval.value};
+  }
+ """
+})
 }) +
-  s"""
-final ArrayData ${ev.value} = new $arrayClass($values);
-this.$values = null;
-  """, isNull = "false")
+  s"\nfinal ArrayData ${ev.value} = $arrayData;\n",
+  isNull = "false")
   }
 
   override def prettyName: String = "array"
 }
 
+private [sql] object GenArrayData {
+  // This function returns Java code pieces based on DataType and isPrimitive
+  // for allocation of ArrayData class
+  def getCodeArrayData(
+  ctx: CodegenContext,
+  dt: DataType,
+  size: Int,
+  isPrimitive : Boolean,
+  array: String): (String, String, String) = {
+if (!isPrimitive) {
+  val arrayClass = classOf[GenericArrayData].getName
+  ctx.addMutableState("Object[]", array,
+s"this.$array = new Object[${size}];")
+  ("", s"new $arrayClass($array)", null)
+} else {
+  val holder = ctx.freshName("holder")
+  val arrayWriter = ctx.freshName("createArrayWriter")
+  val unsafeArrayClass = classOf[UnsafeArrayData].getName
+  val holderClass = classOf[BufferHolder].getName
+  val arrayWriterClass = classOf[UnsafeArrayWriter].getName
+  ctx.addMutableState(unsafeArrayClass, array, "")
+  ctx.addMutableState(holderClass, holder, "")
+  ctx.addMutableState(arrayWriterClass, arrayWriter, "")
+  val baseOffset = Platform.BYTE_ARRAY_OFFSET
+  val unsafeArraySizeInBytes =
+UnsafeArrayData.calculateHeaderPortionInBytes(size) +
+ByteArrayMethods.roundNumberOfBytesToNearestWord(dt.defaultSize * size)
+
+  (s"""
+$array = new $unsafeArrayClass();
+$holder = new $holderClass($unsafeArraySizeInBytes);
+$arrayWriter = new $arrayWriterClass();
--- End diff --

When I did this (more precisely, for the allocation of `UnsafeArrayData`), 
it caused a test failure.
So I intentionally create a new `UnsafeArrayData` every time the expression 
is evaluated for an input row.
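
The thread does not say why the test failed. One common hazard when a single mutable buffer is reused across rows is aliasing, sketched below with plain Scala arrays rather than Spark classes. This is an assumption offered for illustration only; `BufferReuseSketch`, `evalShared`, and `evalFresh` are hypothetical names.

```scala
// Illustration only: plain arrays stand in for per-row result objects.
object BufferReuseSketch {
  // Reuses one backing array for every row, so every returned result
  // points at the same storage and sees the last value written.
  def evalShared(rows: Seq[Int]): Seq[Array[Int]] = {
    val shared = new Array[Int](1)
    rows.map { r => shared(0) = r; shared }
  }

  // Allocates a fresh array per row, so each result keeps its own value.
  def evalFresh(rows: Seq[Int]): Seq[Array[Int]] =
    rows.map { r =>
      val a = new Array[Int](1)
      a(0) = r
      a
    }
}
```

If a caller retains the results of earlier rows, the shared variant silently overwrites them, while the per-row allocation does not.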



[GitHub] spark issue #15717: [SPARK-17910][SQL] Allow users to update the comment of ...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15717
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70073/
Test PASSed.



[GitHub] spark issue #15717: [SPARK-17910][SQL] Allow users to update the comment of ...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15717
  
Merged build finished. Test PASSed.



[GitHub] spark issue #15717: [SPARK-17910][SQL] Allow users to update the comment of ...

2016-12-13 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15717
  
**[Test build #70073 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70073/consoleFull)** for PR 15717 at commit [`a71683c`](https://github.com/apache/spark/commit/a71683c773a1117f58c7e2b5cb5a4b1b101326f4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16104: [SPARK-18675][SQL] CTAS for hive serde table should work...

2016-12-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16104
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70075/
Test PASSed.


