[GitHub] spark issue #21033: [SPARK-19320][MESOS]allow specifying a hard limit on num...

2018-11-13 Thread jomach
Github user jomach commented on the issue:

https://github.com/apache/spark/pull/21033
  
Any progress here? @yanji84 @susanxhuynh


---




[GitHub] spark issue #21033: [SPARK-19320][MESOS]allow specifying a hard limit on num...

2018-11-12 Thread jomach
Github user jomach commented on the issue:

https://github.com/apache/spark/pull/21033
  
@yanji84 How do you identify the GPU if you have multiple GPUs on the 
machine? It would be nice to have some docs for it.


---




[GitHub] spark issue #14644: [SPARK-14082][MESOS] Enable GPU support with Mesos

2018-11-12 Thread jomach
Github user jomach commented on the issue:

https://github.com/apache/spark/pull/14644
  
We have some servers with 8 GPUs each running on Mesos. I would like to run 
Spark on them, but I need Spark to allocate one GPU per map phase. On Hadoop 3.0 
you can do spark.yarn.executor.resource.yarn.io/gpu. I have a Spark job that 
receives a list of files to process; each map task should call a C script that 
reads a chunk of the list and processes it on the GPU. For this, Spark needs to 
recognize the GPU allocated by Mesos (as in "GPU0 is yours"), and of course 
Mesos needs to mark that GPU as used. With this gpu.max setting alone, that is 
not possible.
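
For illustration, a minimal Scala sketch of the kind of job described above. 
The /opt/bin/process_chunk binary, the HDFS paths, and the round-robin device 
pick are all assumptions for the sketch; the round-robin stands in for the 
per-task GPU assignment that Mesos does not provide here:

```
import org.apache.spark.sql.SparkSession

import scala.sys.process._

object GpuChunkJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("gpu-chunk-job").getOrCreate()

    // Hypothetical input: one file path per line; each partition is a chunk.
    val files = spark.sparkContext.textFile("hdfs:///jobs/file-list.txt")

    val results = files.mapPartitionsWithIndex { (partitionId, chunk) =>
      // Without per-task GPU assignment from Mesos, the task must pick a
      // device itself, e.g. round-robin over the 8 local GPUs. Two tasks on
      // the same host can still collide on a device, which is exactly the
      // gap described above.
      val gpuId = partitionId % 8
      chunk.map { file =>
        // Hypothetical C binary; "--gpu" selects the device it should use.
        Seq("/opt/bin/process_chunk", "--gpu", gpuId.toString, file).!!
      }
    }

    results.saveAsTextFile("hdfs:///jobs/output")
    spark.stop()
  }
}
```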


---




[GitHub] spark pull request #19485: [SPARK-20055] [Docs] Added documentation for load...

2017-10-18 Thread jomach
Github user jomach closed the pull request at:

https://github.com/apache/spark/pull/19485


---




[GitHub] spark issue #19485: [SPARK-20055] [Docs] Added documentation for loading csv...

2017-10-18 Thread jomach
Github user jomach commented on the issue:

https://github.com/apache/spark/pull/19485
  
@gatorsmile Will do.


---




[GitHub] spark issue #19485: [SPARK-20055] [Docs] Added documentation for loading csv...

2017-10-18 Thread jomach
Github user jomach commented on the issue:

https://github.com/apache/spark/pull/19485
  
@gatorsmile: we will have a lot of duplication.

Is that fine? I will create a completely new page, like the SQL programming 
guide, name it "Data sources guide", and add all the data sources with all 
their options (duplicating information from the API into the docs). Is that OK 
for everyone?


---




[GitHub] spark issue #19485: [SPARK-20055] [Docs] Added documentation for loading csv...

2017-10-17 Thread jomach
Github user jomach commented on the issue:

https://github.com/apache/spark/pull/19485
  
Yes, I'm viewing the docs with Jekyll. I addressed that in my previous 
comment. I really don't think we should make a huge example like the JSON one. 
It's a CSV...

What do you think?


---




[GitHub] spark issue #19485: [SPARK-20055] [Docs] Added documentation for loading csv...

2017-10-15 Thread jomach
Github user jomach commented on the issue:

https://github.com/apache/spark/pull/19485
  
So I removed the duplicated stuff and added the links. I deliberately did 
not add more examples, as the document is getting huge and it is hard to find 
things in it. What do you think?


---




[GitHub] spark issue #19485: [SPARK-20055] [Docs] Added documentation for loading csv...

2017-10-14 Thread jomach
Github user jomach commented on the issue:

https://github.com/apache/spark/pull/19485
  
OK, so I will do the following:
  - Create a new section for CSV datasets.
  - Add more example options to the code in JavaSQLDataSourceExample.java 
(and its .scala, .py and .r counterparts); a sketch follows below.
  - Reference the links from the API.

The effect is that we will not see all the options on the .md page, and 
people will need to jump into the API. Do you agree with this?

It would be cool if, from jekyllrb, we could create something like an iframe 
and pull the options from the Scala API... Any ideas?

Please let me know if it is OK to proceed this way.
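
As a rough sketch of the kind of option-focused snippet this could add 
(Scala shown; the option values and file path are illustrative, and an 
existing SparkSession named spark is assumed):

```
// Load a CSV file while specifying options explicitly; assumes an
// existing SparkSession named `spark`.
val peopleDF = spark.read
  .format("csv")
  .option("sep", ";")            // field delimiter
  .option("header", "true")      // first line holds the column names
  .option("inferSchema", "true") // infer column types from the data
  .load("examples/src/main/resources/people.csv")
```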


---




[GitHub] spark pull request #7842: [SPARK-8542][MLlib]PMML export for Decision Trees

2017-10-13 Thread jomach
Github user jomach commented on a diff in the pull request:

https://github.com/apache/spark/pull/7842#discussion_r144641913
  
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/pmml/export/PMMLTreeModelUtils.scala ---
@@ -0,0 +1,261 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.mllib.pmml.export
+
+import scala.collection.mutable
+import scala.collection.JavaConverters._
+
+import org.dmg.pmml.{Node => PMMLNode, Value => PMMLValue, _}
+
--- End diff --

Remove blank line.


---




[GitHub] spark pull request #7842: [SPARK-8542][MLlib]PMML export for Decision Trees

2017-10-13 Thread jomach
Github user jomach commented on a diff in the pull request:

https://github.com/apache/spark/pull/7842#discussion_r144642103
  
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/pmml/export/PMMLTreeModelUtils.scala ---
@@ -0,0 +1,261 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.mllib.pmml.export
+
+import scala.collection.mutable
+import scala.collection.JavaConverters._
+
+import org.dmg.pmml.{Node => PMMLNode, Value => PMMLValue, _}
+
+import org.apache.spark.mllib.tree.configuration.{Algo, FeatureType}
+import org.apache.spark.mllib.tree.configuration.Algo._
+import org.apache.spark.mllib.tree.model.{DecisionTreeModel, Node}
+
+private[mllib] object PMMLTreeModelUtils {
+
+  val FieldNamePrefix = "field_"
+
+  def toPMMLTree(dtModel: DecisionTreeModel, modelName: String): (TreeModel, List[DataField]) = {
+
+    val miningFunctionType = dtModel.algo match {
+      case Algo.Classification => MiningFunctionType.CLASSIFICATION
+      case Algo.Regression => MiningFunctionType.REGRESSION
+    }
+
+    val treeModel = new TreeModel()
+      .setModelName(modelName)
+      .setFunctionName(miningFunctionType)
+      .setSplitCharacteristic(TreeModel.SplitCharacteristic.BINARY_SPLIT)
+
+    var (rootNode, miningFields, dataFields, classes) = buildStub(dtModel.topNode, dtModel.algo)
+
+    // adding predicted classes for classification and target field for regression for completeness
+    dtModel.algo match {
+
+      case Algo.Classification =>
+        miningFields = miningFields :+ new MiningField()
+          .setName(FieldName.create("class"))
+          .setUsageType(FieldUsageType.PREDICTED)
+
+        val dataField = new DataField()
+          .setName(FieldName.create("class"))
+          .setOpType(OpType.CATEGORICAL)
+          .addValues(classes: _*)
+          .setDataType(DataType.DOUBLE)
+
+        dataFields = dataFields :+ dataField
+
+      case Algo.Regression =>
+        val targetField = FieldName.create("target")
+        val dataField = new DataField(targetField, OpType.CONTINUOUS, DataType.DOUBLE)
+        dataFields = dataFields :+ dataField
+
+        miningFields = miningFields :+ new MiningField()
+          .setName(targetField)
+          .setUsageType(FieldUsageType.TARGET)
+
+    }
+
+    val miningSchema = new MiningSchema().addMiningFields(miningFields: _*)
+
+    treeModel.setNode(rootNode).setMiningSchema(miningSchema)
+
+    (treeModel, dataFields)
+  }
+
+  /** Build a pmml tree stub given the root mllib node. */
+  private def buildStub(rootDTNode: Node, algo: Algo):
+      (PMMLNode, List[MiningField], List[DataField], List[PMMLValue]) = {
+
+    val miningFields = mutable.MutableList[MiningField]()
+    val dataFields = mutable.HashMap[String, DataField]()
+    val classes = mutable.MutableList[Double]()
+
+    def buildStubInternal(rootNode: Node, predicate: Predicate): PMMLNode = {
+
+      // get rootPMML node for the MLLib node
+      val rootPMMLNode = new PMMLNode()
+        .setId(rootNode.id.toString)
+        .setScore(rootNode.predict.predict.toString)
+        .setPredicate(predicate)
+
+      var leftPredicate: Predicate = new True()
+      var rightPredicate: Predicate = new True()
+
+      if (rootNode.split.isDefined) {
+        val fieldName = FieldName.create(FieldNamePrefix + rootNode.split.get.feature)
+        val dataField = getDataField(rootNode, fieldName).get
+
+        if (dataFields.get(dataField.getName.getValue).isEmpty) {
+          dataFields.put(dataField.getName.getValue, dataField)
+          miningFields += new MiningField()
+            .setName(dataField.getName)
+            .setUsageType(FieldUsageType.

[GitHub] spark pull request #7842: [SPARK-8542][MLlib]PMML export for Decision Trees

2017-10-13 Thread jomach
Github user jomach commented on a diff in the pull request:

https://github.com/apache/spark/pull/7842#discussion_r144642031
  

[GitHub] spark pull request #7842: [SPARK-8542][MLlib]PMML export for Decision Trees

2017-10-13 Thread jomach
Github user jomach commented on a diff in the pull request:

https://github.com/apache/spark/pull/7842#discussion_r144642055
  

[GitHub] spark issue #19485: [SPARK-20055] [Docs] Added documentation for loading csv...

2017-10-13 Thread jomach
Github user jomach commented on the issue:

https://github.com/apache/spark/pull/19485
  
@HyukjinKwon I came up with this. What do you think? What I don't like about 
it is that I did not find any way to read Javadocs into the markdown so that we 
don't have duplicates. Any idea, or should we leave it as in this PR?


---




[GitHub] spark issue #19485: [SPARK-20055] [Docs] Added documentation for loading csv...

2017-10-12 Thread jomach
Github user jomach commented on the issue:

https://github.com/apache/spark/pull/19485
  
Yes, I will do it. Give me a few days, please.


---




[GitHub] spark issue #19485: [SPARK-20055] [Docs] Added documentation for loading csv...

2017-10-12 Thread jomach
Github user jomach commented on the issue:

https://github.com/apache/spark/pull/19485
  
@HyukjinKwon Here is the new entry, as the other one is closed/merged.


---




[GitHub] spark pull request #19485: [SPARK-20055] [Docs] Added documentation for load...

2017-10-12 Thread jomach
GitHub user jomach opened a pull request:

https://github.com/apache/spark/pull/19485

[SPARK-20055] [Docs] Added documentation for loading csv files into 
DataFrames Fix

## What changes were proposed in this pull request?

Small rendering fix.

## How was this patch tested?

Reviewers.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jomach/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19485.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19485


commit f5941bf196a36afe8715d713fcaaf3f1a136d9e8
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date:   2017-10-04T13:09:16Z

SPARK-20055 Documentation
 -Added documentation for loading csv files into Dataframes

commit 812bdf7a44ed2e52c7012921814da6bb73d0033c
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date:   2017-10-04T12:58:44Z

SPARK-20055 Documentation
 - Some examples on how to create a dataframe with a csv file

(cherry picked from commit e8ca1dc)

commit 4e4a02ba271bfb9811d31cd1909c942be4322682
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date:   2017-10-04T13:09:16Z

SPARK-20055 Documentation
 -Added documentation for loading csv files into Dataframes

commit a2ec38a7b86b9cf89f7f4b9cf6368b9864ef10c2
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date:   2017-10-05T08:27:20Z

SPARK-20055 Documentation
 - Some examples on how to create a dataframe with a csv file

(cherry picked from commit a546421)

commit 793628bbedcc50c0845a3fd999d2720e2c63ea1d
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date:   2017-10-05T08:32:15Z

Merge remote-tracking branch 'origin/master'

# Conflicts:
#   docs/sql-programming-guide.md
#   
examples/src/main/java/org/apache/spark/examples/sql/JavaSQLDataSourceExample.java
#   examples/src/main/r/RSparkSQLExample.R

commit cd69fa240d453a7b8344796349a2bf03a20ffbfc
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date:   2017-10-10T05:52:37Z

SPARK-20055 Documentation
 - Some examples on how to create a dataframe with a csv file

commit 68799ede999ec1874c80d242441032cd29a2f695
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date:   2017-10-11T07:29:33Z

SPARK-20055 Documentation
 - PR comments

commit 7ff1d84779acc50ab3c63d9bc0651ac53193f555
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date:   2017-10-11T08:09:49Z

SPARK-20055 Documentation
 - PR comments

commit 07d73fcac85529fa17e34b170f2941f0f579fe00
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date:   2017-10-12T15:12:35Z

Merge branch 'upstream/masterlocal'

commit 73b1d7aed4c0fd740d5fbdde569d6b3ff3b86271
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date:   2017-10-12T16:03:51Z

SPARK-20055 Documentation
 - PR comments




---




[GitHub] spark pull request #19429: [SPARK-20055] [Docs] Added documentation for load...

2017-10-12 Thread jomach
Github user jomach commented on a diff in the pull request:

https://github.com/apache/spark/pull/19429#discussion_r144321507
  
--- Diff: docs/sql-programming-guide.md ---
@@ -479,6 +481,26 @@ source type can be converted into other types using this syntax.
 
 
 
+To load a CSV file you can use:
+
+
+
{% include_example manual_load_options_csv scala/org/apache/spark/examples/sql/SQLDataSourceExample.scala %}
+
+
+
{% include_example manual_load_options_csv java/org/apache/spark/examples/sql/JavaSQLDataSourceExample.java %}
+
+
+
+{% include_example manual_load_options_csv python/sql/datasource.py %}
+
+
+
+{% include_example manual_load_options_csv r/RSparkSQLExample.R %}
+
+
+
 ### Run SQL on files directly
--- End diff --

@HyukjinKwon should I add a newline between lines 503 and 504? 
For example:
```
{% include_example generic_load_save_functions r/RSparkSQLExample.R %}




### Manually Specifying Options
```


---




[GitHub] spark issue #19429: [SPARK-20055] [Docs] Added documentation for loading csv...

2017-10-11 Thread jomach
Github user jomach commented on the issue:

https://github.com/apache/spark/pull/19429
  
@gatorsmile PR comments fixed. The problem with the current docs is that 
when people start with Spark, they usually don't start with JSON files but with 
CSV files, just to "see" something.


---




[GitHub] spark issue #19429: [SPARK-20055] [Docs] Added documentation for loading csv...

2017-10-11 Thread jomach
Github user jomach commented on the issue:

https://github.com/apache/spark/pull/19429
  
@gatorsmile PR comments fixed. Sorry, but it is my first time.


---




[GitHub] spark issue #19429: [SPARK-20055] [Docs] Added documentation for loading csv...

2017-10-09 Thread jomach
Github user jomach commented on the issue:

https://github.com/apache/spark/pull/19429
  
@gatorsmile I addressed your comments. Still, I cannot get the Jekyll build 
to run...
```
SKIP_API=1 jekyll build --incremental
Configuration file: /Users/jorge/Downloads/spark/docs/_config.yml
       Deprecation: The 'gems' configuration option has been renamed to 'plugins'. Please update your config file accordingly.
            Source: /Users/jorge/Downloads/spark/docs
       Destination: /Users/jorge/Downloads/spark/docs/_site
 Incremental build: enabled
      Generating...
  Liquid Exception: invalid byte sequence in US-ASCII in _layouts/redirect.html
```


---




[GitHub] spark issue #19429: [SPARK-20055] [Docs] Added documentation for loading csv...

2017-10-05 Thread jomach
Github user jomach commented on the issue:

https://github.com/apache/spark/pull/19429
  
@felixcheung Sorry for that. It should be there now. Can you test? Thanks.


---




[GitHub] spark pull request #19429: [SPARK-20055] [Docs] Added documentation for load...

2017-10-04 Thread jomach
GitHub user jomach opened a pull request:

https://github.com/apache/spark/pull/19429

[SPARK-20055] [Docs] Added documentation for loading csv files into 
DataFrames

 

## What changes were proposed in this pull request?

Added documentation for loading CSV files into DataFrames.

## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration 
tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, 
remove this)

Please review http://spark.apache.org/contributing.html before opening a 
pull request.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jomach/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19429.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19429


commit f5941bf196a36afe8715d713fcaaf3f1a136d9e8
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date:   2017-10-04T13:09:16Z

SPARK-20055 Documentation
 -Added documentation for loading csv files into Dataframes




---
