[jira] [Commented] (SPARK-26748) CLONE - Autoencoder

2019-01-28 Thread Chris Bogan (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16753891#comment-16753891
 ] 

Chris Bogan commented on SPARK-26748:
-

My mistake, I am terribly sorry.

> CLONE - Autoencoder
> ---
>
> Key: SPARK-26748
> URL: https://issues.apache.org/jira/browse/SPARK-26748
> Project: Spark
>  Issue Type: Improvement
>  Components: ML
>Affects Versions: 1.5.0
>Reporter: Chris Bogan
>Assignee: Alexander Ulanov
>Priority: Major
>
> Goal: Implement various types of autoencoders 
> Requirements:
> 1) Basic (deep) autoencoder that supports different types of inputs: binary, 
> real in [0..1], real in [-inf, +inf] 
> 2) Sparse autoencoder, i.e. L1 regularization. It should be added as a feature 
> to the MLP and then used here 
> 3) Denoising autoencoder 
> 4) Stacked autoencoder for pre-training of deep networks. It should support 
> arbitrary network layers
> References: 
> 1. Vincent, Pascal, et al. "Extracting and composing robust features with 
> denoising autoencoders." Proceedings of the 25th international conference on 
> Machine learning. ACM, 2008. 
> http://www.iro.umontreal.ca/~vincentp/Publications/denoising_autoencoders_tr1316.pdf
>  
> 2. 
> http://machinelearning.wustl.edu/mlpapers/paper_files/ICML2011Rifai_455.pdf
> 3. Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., and Manzagol, P.-A. 
> (2010). Stacked denoising autoencoders: Learning useful representations in a 
> deep network with a local denoising criterion. Journal of Machine Learning 
> Research, 11, 3371–3408. 
> http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.297.3484&rep=rep1&type=pdf
> 4, 5, 6. Bengio, Yoshua, et al. "Greedy layer-wise training of deep 
> networks." Advances in neural information processing systems 19 (2007): 153. 
> http://www.iro.umontreal.ca/~lisa/pointeurs/dbn_supervised_tr1282.pdf



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-26748) CLONE - Autoencoder

2019-01-28 Thread Chris Bogan (JIRA)
Chris Bogan created SPARK-26748:
---

 Summary: CLONE - Autoencoder
 Key: SPARK-26748
 URL: https://issues.apache.org/jira/browse/SPARK-26748
 Project: Spark
  Issue Type: Improvement
  Components: ML
Affects Versions: 1.5.0
Reporter: Chris Bogan
Assignee: Alexander Ulanov


Goal: Implement various types of autoencoders 
Requirements:
1) Basic (deep) autoencoder that supports different types of inputs: binary, 
real in [0..1], real in [-inf, +inf] 
2) Sparse autoencoder, i.e. L1 regularization. It should be added as a feature to 
the MLP and then used here 
3) Denoising autoencoder 
4) Stacked autoencoder for pre-training of deep networks. It should support 
arbitrary network layers
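
The building blocks named in requirements 1 to 3 can be sketched in a few lines. This is a minimal illustration in plain Python, not the proposed Spark ML API: the function names (`corrupt`, `l1_penalty`, `reconstruction_loss`), the use of masking noise for the denoising case, and the pairing of cross-entropy with inputs in [0..1] and squared error with unbounded real inputs are all assumptions made for the example. Stacking (requirement 4) is not covered.

```python
import math
import random


def corrupt(x, rate, rng):
    """Masking noise for a denoising autoencoder (assumed noise model):
    each input value is set to 0.0 with probability `rate`."""
    return [0.0 if rng.random() < rate else v for v in x]


def l1_penalty(hidden, lam):
    """L1 sparsity term added to the loss of a sparse autoencoder."""
    return lam * sum(abs(h) for h in hidden)


def reconstruction_loss(x, x_hat, kind):
    """Reconstruction loss matched to the input type (an assumed pairing).

    'binary' or 'unit': cross-entropy, x_hat expected in [0, 1]
    'real':             half the squared error, x_hat unbounded
    """
    if kind in ("binary", "unit"):
        eps = 1e-12  # keeps log() finite at exactly 0 or 1
        return -sum(
            v * math.log(p + eps) + (1.0 - v) * math.log(1.0 - p + eps)
            for v, p in zip(x, x_hat)
        )
    if kind == "real":
        return 0.5 * sum((v - p) ** 2 for v, p in zip(x, x_hat))
    raise ValueError(f"unknown input kind: {kind}")


rng = random.Random(0)
x = [1.0, 0.0, 1.0, 1.0]
noisy = corrupt(x, rate=0.5, rng=rng)  # the encoder would see `noisy`, the loss compares against `x`
```

In a denoising setup the network is fed `noisy` while the loss is still computed against the clean `x`; the L1 term would be added to that loss on the hidden activations.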


References: 
1. Vincent, Pascal, et al. "Extracting and composing robust features with 
denoising autoencoders." Proceedings of the 25th international conference on 
Machine learning. ACM, 2008. 
http://www.iro.umontreal.ca/~vincentp/Publications/denoising_autoencoders_tr1316.pdf
 
2. http://machinelearning.wustl.edu/mlpapers/paper_files/ICML2011Rifai_455.pdf
3. Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., and Manzagol, P.-A. 
(2010). Stacked denoising autoencoders: Learning useful representations in a 
deep network with a local denoising criterion. Journal of Machine Learning 
Research, 11, 3371–3408. 
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.297.3484&rep=rep1&type=pdf
4, 5, 6. Bengio, Yoshua, et al. "Greedy layer-wise training of deep networks." 
Advances in neural information processing systems 19 (2007): 153. 
http://www.iro.umontreal.ca/~lisa/pointeurs/dbn_supervised_tr1282.pdf






[jira] [Updated] (SPARK-26731) remove EOLed spark jobs from jenkins

2019-01-26 Thread Chris Bogan (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Bogan updated SPARK-26731:

Attachment: LICENSE

> remove EOLed spark jobs from jenkins
> 
>
> Key: SPARK-26731
> URL: https://issues.apache.org/jira/browse/SPARK-26731
> Project: Spark
>  Issue Type: Task
>  Components: Build
>Affects Versions: 1.6.3, 2.0.2, 2.1.3
>Reporter: shane knapp
>Assignee: shane knapp
>Priority: Major
> Attachments: LICENSE, activemq-cli-tools-4a984ec.tar.gz
>
>
> i will disable, but not remove (yet), the branch-specific builds for 1.6, 2.0 
> and 2.1 on jenkins.
> these include all test builds, as well as docs, lint, compile, and packaging.






[jira] [Updated] (SPARK-26722) add SPARK_TEST_KEY=1 to pull request builder and spark-master-test-sbt-hadoop-2.7

2019-01-26 Thread Chris Bogan (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Bogan updated SPARK-26722:

Attachment: SPARK-26731.doc

> add SPARK_TEST_KEY=1 to pull request builder and 
> spark-master-test-sbt-hadoop-2.7
> -
>
> Key: SPARK-26722
> URL: https://issues.apache.org/jira/browse/SPARK-26722
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.0.0
>Reporter: shane knapp
>Assignee: shane knapp
>Priority: Major
> Attachments: SPARK-26731.doc
>
>
> from https://github.com/apache/spark/pull/23117:
> we need to add the {{SPARK_TEST_KEY=1}} env var to both the GHPRB and 
> {{spark-master-test-sbt-hadoop-2.7}} builds.
> this is done for the PRB, and was manually added to the 
> {{spark-master-test-sbt-hadoop-2.7}} build.
> i will leave this open until i finish porting the JJB configs into the main 
> spark repo (for the {{spark-master-test-sbt-hadoop-2.7}} build).






[jira] [Issue Comment Deleted] (SPARK-26731) remove EOLed spark jobs from jenkins

2019-01-26 Thread Chris Bogan (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Bogan updated SPARK-26731:

Comment: was deleted

(was: {code:java}
// code placeholder
{code}
$ cd infrastructure/content/dev/
$ $EDITOR infra-site.mdtext
$ svn commit
... wait a short while for the page to be rebuilt ...
... **ONLY IF** you are an ASF Member, then publish: ...
$ curl -sL http://s.apache.org/cms-cli | perl)

> remove EOLed spark jobs from jenkins
> 
>
> Key: SPARK-26731
> URL: https://issues.apache.org/jira/browse/SPARK-26731
> Project: Spark
>  Issue Type: Task
>  Components: Build
>Affects Versions: 1.6.3, 2.0.2, 2.1.3
>Reporter: shane knapp
>Assignee: shane knapp
>Priority: Major
> Attachments: LICENSE, activemq-cli-tools-4a984ec.tar.gz
>
>
> i will disable, but not remove (yet), the branch-specific builds for 1.6, 2.0 
> and 2.1 on jenkins.
> these include all test builds, as well as docs, lint, compile, and packaging.






[jira] [Commented] (SPARK-26731) remove EOLed spark jobs from jenkins

2019-01-26 Thread Chris Bogan (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16752988#comment-16752988
 ] 

Chris Bogan commented on SPARK-26731:
-

{code:java}
// code placeholder
{code}
$ cd infrastructure/content/dev/
$ $EDITOR infra-site.mdtext
$ svn commit
... wait a short while for the page to be rebuilt ...
... **ONLY IF** you are an ASF Member, then publish: ...
$ curl -sL http://s.apache.org/cms-cli | perl

> remove EOLed spark jobs from jenkins
> 
>
> Key: SPARK-26731
> URL: https://issues.apache.org/jira/browse/SPARK-26731
> Project: Spark
>  Issue Type: Task
>  Components: Build
>Affects Versions: 1.6.3, 2.0.2, 2.1.3
>Reporter: shane knapp
>Assignee: shane knapp
>Priority: Major
> Attachments: LICENSE, activemq-cli-tools-4a984ec.tar.gz
>
>
> i will disable, but not remove (yet), the branch-specific builds for 1.6, 2.0 
> and 2.1 on jenkins.
> these include all test builds, as well as docs, lint, compile, and packaging.






[jira] [Updated] (SPARK-26327) Metrics in FileSourceScanExec not update correctly while relation.partitionSchema is set

2019-01-26 Thread Chris Bogan (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Bogan updated SPARK-26327:

Attachment: request_handler_interface.json

> Metrics in FileSourceScanExec not update correctly while 
> relation.partitionSchema is set
> 
>
> Key: SPARK-26327
> URL: https://issues.apache.org/jira/browse/SPARK-26327
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Yuanjian Li
>Assignee: Yuanjian Li
>Priority: Major
> Fix For: 2.2.3, 2.3.3, 2.4.1, 3.0.0
>
> Attachments: Homepage - Material Design, 
> apache-opennlp-1.9.1-bin.tar.gz, request_handler_interface.json
>
>
> In the current implementation of `FileSourceScanExec`, the "numFiles" and 
> "metadataTime" (fileListingTime) metrics are updated when the lazy val 
> `selectedPartitions` is initialized, in the case where 
> relation.partitionSchema is set. But `selectedPartitions` is first 
> initialized by `metadata`, which is called by `queryExecution.toString` in 
> `SQLExecution.withNewExecutionId`. So when 
> `SQLMetrics.postDriverMetricUpdates` is called, there is no corresponding 
> entry in liveExecutions in SQLAppStatusListener, and the metrics update has 
> no effect.
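
The ordering problem described in this issue can be reproduced in miniature. The sketch below is a plain-Python analogue, not Spark code: `StatusListener` and `Scan` are hypothetical stand-ins for `SQLAppStatusListener` and `FileSourceScanExec`, `functools.cached_property` plays the role of the Scala lazy val, and the `PartitionCount` key is illustrative only.

```python
from functools import cached_property


class StatusListener:
    """Hypothetical stand-in for SQLAppStatusListener."""

    def __init__(self):
        self.live_executions = {}

    def start(self, execution_id):
        self.live_executions[execution_id] = {}

    def post_driver_metric_updates(self, execution_id, metrics):
        live = self.live_executions.get(execution_id)
        if live is None:
            return False  # no live execution yet: the update is dropped
        live.update(metrics)
        return True


class Scan:
    """Hypothetical stand-in for FileSourceScanExec."""

    def __init__(self, listener, execution_id, files):
        self.listener = listener
        self.execution_id = execution_id
        self.files = files
        self.update_delivered = None

    @cached_property
    def selected_partitions(self):
        # Like a lazy val, this body (and its side effect) runs only once,
        # on the first access.
        self.update_delivered = self.listener.post_driver_metric_updates(
            self.execution_id, {"numFiles": len(self.files)}
        )
        return list(self.files)

    @property
    def metadata(self):
        return {"PartitionCount": len(self.selected_partitions)}


listener = StatusListener()
scan = Scan(listener, execution_id=1, files=["a.parquet", "b.parquet"])
_ = scan.metadata             # first access, before the execution is live
listener.start(1)
_ = scan.selected_partitions  # cached value: the update is not posted again
```

After this sequence `scan.update_delivered` is `False` and the live execution has no `numFiles` entry, which mirrors the lost metric update reported above.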






[jira] [Updated] (SPARK-26731) remove EOLed spark jobs from jenkins

2019-01-26 Thread Chris Bogan (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Bogan updated SPARK-26731:

Attachment: activemq-cli-tools-4a984ec.tar.gz

> remove EOLed spark jobs from jenkins
> 
>
> Key: SPARK-26731
> URL: https://issues.apache.org/jira/browse/SPARK-26731
> Project: Spark
>  Issue Type: Task
>  Components: Build
>Affects Versions: 1.6.3, 2.0.2, 2.1.3
>Reporter: shane knapp
>Assignee: shane knapp
>Priority: Major
> Attachments: LICENSE, activemq-cli-tools-4a984ec.tar.gz
>
>
> i will disable, but not remove (yet), the branch-specific builds for 1.6, 2.0 
> and 2.1 on jenkins.
> these include all test builds, as well as docs, lint, compile, and packaging.






[jira] [Updated] (SPARK-26327) Metrics in FileSourceScanExec not update correctly while relation.partitionSchema is set

2019-01-26 Thread Chris Bogan (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Bogan updated SPARK-26327:

Attachment: apache-opennlp-1.9.1-bin.tar.gz

> Metrics in FileSourceScanExec not update correctly while 
> relation.partitionSchema is set
> 
>
> Key: SPARK-26327
> URL: https://issues.apache.org/jira/browse/SPARK-26327
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Yuanjian Li
>Assignee: Yuanjian Li
>Priority: Major
> Fix For: 2.2.3, 2.3.3, 2.4.1, 3.0.0
>
> Attachments: Homepage - Material Design, 
> apache-opennlp-1.9.1-bin.tar.gz
>
>
> In the current implementation of `FileSourceScanExec`, the "numFiles" and 
> "metadataTime" (fileListingTime) metrics are updated when the lazy val 
> `selectedPartitions` is initialized, in the case where 
> relation.partitionSchema is set. But `selectedPartitions` is first 
> initialized by `metadata`, which is called by `queryExecution.toString` in 
> `SQLExecution.withNewExecutionId`. So when 
> `SQLMetrics.postDriverMetricUpdates` is called, there is no corresponding 
> entry in liveExecutions in SQLAppStatusListener, and the metrics update has 
> no effect.






[jira] [Updated] (SPARK-26327) Metrics in FileSourceScanExec not update correctly while relation.partitionSchema is set

2019-01-26 Thread Chris Bogan (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Bogan updated SPARK-26327:

Attachment: Homepage - Material Design

> Metrics in FileSourceScanExec not update correctly while 
> relation.partitionSchema is set
> 
>
> Key: SPARK-26327
> URL: https://issues.apache.org/jira/browse/SPARK-26327
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Yuanjian Li
>Assignee: Yuanjian Li
>Priority: Major
> Fix For: 2.2.3, 2.3.3, 2.4.1, 3.0.0
>
> Attachments: Homepage - Material Design
>
>
> In the current implementation of `FileSourceScanExec`, the "numFiles" and 
> "metadataTime" (fileListingTime) metrics are updated when the lazy val 
> `selectedPartitions` is initialized, in the case where 
> relation.partitionSchema is set. But `selectedPartitions` is first 
> initialized by `metadata`, which is called by `queryExecution.toString` in 
> `SQLExecution.withNewExecutionId`. So when 
> `SQLMetrics.postDriverMetricUpdates` is called, there is no corresponding 
> entry in liveExecutions in SQLAppStatusListener, and the metrics update has 
> no effect.






[jira] [Updated] (SPARK-26518) UI Application Info Race Condition Can Throw NoSuchElement

2019-01-15 Thread Chris Bogan (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Bogan updated SPARK-26518:

Attachment: 15476091405552344590691778159589.jpg

> UI Application Info Race Condition Can Throw NoSuchElement
> --
>
> Key: SPARK-26518
> URL: https://issues.apache.org/jira/browse/SPARK-26518
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.0, 2.4.0
>Reporter: Russell Spitzer
>Priority: Trivial
> Attachments: 15476091405552344590691778159589.jpg
>
>
> There is a slight race condition in the 
> [AppStatusStore|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala#L39],
> which calls `next` on the returned store iterator even if it is empty, which 
> it can be for a short period of time after the UI is up but before the store 
> is populated.
> {code}
> Error 500 Server Error
> HTTP ERROR 500
> Problem accessing /jobs/. Reason:
> Server Error
> Caused by: java.util.NoSuchElementException
> at java.util.Collections$EmptyIterator.next(Collections.java:4189)
> at org.apache.spark.util.kvstore.InMemoryStore$InMemoryIterator.next(InMemoryStore.java:281)
> at org.apache.spark.status.AppStatusStore.applicationInfo(AppStatusStore.scala:38)
> at org.apache.spark.ui.jobs.AllJobsPage.render(AllJobsPage.scala:275)
> at org.apache.spark.ui.WebUI$$anonfun$3.apply(WebUI.scala:86)
> at org.apache.spark.ui.WebUI$$anonfun$3.apply(WebUI.scala:86)
> at org.apache.spark.ui.JettyUtils$$anon$3.doGet(JettyUtils.scala:90)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
> at org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:865)
> at org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:535)
> at org.spark_project.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
> at org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1317)
> at org.spark_project.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
> at org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
> at org.spark_project.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
> at org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1219)
> at org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
> at org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:724)
> at org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
> at org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> at org.spark_project.jetty.server.Server.handle(Server.java:531)
> at org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:352)
> at org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
> at org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:281)
> at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:102)
> at org.spark_project.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
> at org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:762)
> at org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:680)
> at java.lang.Thread.run(Thread.java:748)
> {code}
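
The failure mode in the trace, calling `next` on an iterator over a store that is still empty, can be shown with a small analogue. The sketch below is plain Python and makes no claim about how Spark fixed the issue: `AppStore` is a hypothetical in-memory store rather than Spark's KVStore API, and Python's `StopIteration` stands in for Java's `NoSuchElementException`.

```python
class AppStore:
    """Hypothetical in-memory store; not Spark's KVStore API."""

    def __init__(self):
        self._apps = []

    def add(self, info):
        self._apps.append(info)

    def view(self):
        return iter(self._apps)


def application_info(store):
    # Unguarded access: raises StopIteration while the store is empty.
    return next(store.view())


def application_info_or_none(store):
    # Guarded access: returns None until the store has been populated,
    # leaving the caller to decide what to render in the meantime.
    return next(store.view(), None)


store = AppStore()
early = application_info_or_none(store)  # UI is up, store not yet populated
store.add({"name": "demo-app"})
late = application_info_or_none(store)   # store populated
```

Here `early` is `None` and `late` is the stored record, whereas `application_info` on an empty store raises.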


