[GitHub] spark issue #21033: [SPARK-19320][MESOS]allow specifying a hard limit on num...
Github user jomach commented on the issue: https://github.com/apache/spark/pull/21033 Any progress here? @yanji84 @susanxhuynh --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21033: [SPARK-19320][MESOS]allow specifying a hard limit on num...
Github user jomach commented on the issue: https://github.com/apache/spark/pull/21033 @yanji84 How do you identify the GPU if you have multiple GPUs on the machine? It would be nice to have some docs for it.
[GitHub] spark issue #14644: [SPARK-14082][MESOS] Enable GPU support with Mesos
Github user jomach commented on the issue: https://github.com/apache/spark/pull/14644 We have some servers running 8 GPUs on Mesos. I would like to run Spark on them, but I need to be able to allocate one GPU per map phase from Spark. On Hadoop 3.0 you can use spark.yarn.executor.resource.yarn.io/gpu. I have a Spark job that receives a list of files to process; each map task should call a C script that reads a chunk of the list and processes it on the GPU. For this, Spark needs to recognize the GPU allocated by Mesos (e.g. "GPU0 is yours"), and Mesos of course needs to mark that GPU as used. With just gpu.max this is not possible.
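For context, the Mesos GPU support under discussion is exposed through a single upper-bound setting rather than per-task device assignment. A minimal sketch of how it is configured (the master URL, class name, and jar are placeholders, not from this thread):

    # Request up to one GPU per job via Mesos coarse-grained mode.
    # spark.mesos.gpus.max only caps the total GPUs acquired; it does not
    # tell a task *which* device (e.g. GPU0) it was assigned -- which is
    # exactly the limitation raised in the comment above.
    ./bin/spark-submit \
      --master mesos://mesos-master:5050 \
      --conf spark.mesos.gpus.max=1 \
      --class com.example.GpuJob \
      myjob.jar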
[GitHub] spark pull request #19485: [SPARK-20055] [Docs] Added documentation for load...
Github user jomach closed the pull request at: https://github.com/apache/spark/pull/19485
[GitHub] spark issue #19485: [SPARK-20055] [Docs] Added documentation for loading csv...
Github user jomach commented on the issue: https://github.com/apache/spark/pull/19485 @gatorsmile will do
[GitHub] spark issue #19485: [SPARK-20055] [Docs] Added documentation for loading csv...
Github user jomach commented on the issue: https://github.com/apache/spark/pull/19485 @gatorsmile: we will have a lot of duplication. Is that fine? I will create a completely new page, like the SQL programming guide, name it "Data sources guide", and add all the data sources with all their options (duplicating information from the API into the docs). Is that OK with everyone?
[GitHub] spark issue #19485: [SPARK-20055] [Docs] Added documentation for loading csv...
Github user jomach commented on the issue: https://github.com/apache/spark/pull/19485 Yes, I'm viewing the docs with Jekyll; I addressed that in my previous comment. I really don't think we should make a huge example as the JSON section does. It's a CSV... What do you think?
[GitHub] spark issue #19485: [SPARK-20055] [Docs] Added documentation for loading csv...
Github user jomach commented on the issue: https://github.com/apache/spark/pull/19485 So I removed the duplicated stuff and added the links. I deliberately did not add more examples, as the document is getting huge and it is hard to find things. What do you think?
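As a reference point for the kind of short CSV snippet under discussion, the example included in the docs boils down to roughly the following (a sketch; the separator, options, and file path are illustrative, and an active SparkSession named spark is assumed):

    // Load a semicolon-separated CSV file into a DataFrame,
    // using the first row as header and inferring column types.
    val peopleDFCsv = spark.read.format("csv")
      .option("sep", ";")
      .option("inferSchema", "true")
      .option("header", "true")
      .load("examples/src/main/resources/people.csv")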
[GitHub] spark issue #19485: [SPARK-20055] [Docs] Added documentation for loading csv...
Github user jomach commented on the issue: https://github.com/apache/spark/pull/19485 OK, so I will: - create a new section for CSV datasets - add more example options to the code in JavaSQLDataSourceExample.java (and .scala, .py and .r) - reference the links from the API. This will have the effect that we will not see all the options on the .md page and people will need to jump into the API. Do you agree with this? It would be cool if from Jekyll we could create something like an iframe and pull the options from the Scala API... Any ideas? Please let me know if it is OK to proceed this way.
[GitHub] spark pull request #7842: [SPARK-8542][MLlib]PMML export for Decision Trees
Github user jomach commented on a diff in the pull request: https://github.com/apache/spark/pull/7842#discussion_r144641913

--- Diff: mllib/src/main/scala/org/apache/spark/mllib/pmml/export/PMMLTreeModelUtils.scala ---
@@ -0,0 +1,261 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.mllib.pmml.export
+
+import scala.collection.mutable
+import scala.collection.JavaConverters._
+
+import org.dmg.pmml.{Node => PMMLNode, Value => PMMLValue, _}
+
--- End diff --

remove blank line
[GitHub] spark pull request #7842: [SPARK-8542][MLlib]PMML export for Decision Trees
Github user jomach commented on a diff in the pull request: https://github.com/apache/spark/pull/7842#discussion_r144642103

--- Diff: mllib/src/main/scala/org/apache/spark/mllib/pmml/export/PMMLTreeModelUtils.scala ---
@@ -0,0 +1,261 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.mllib.pmml.export
+
+import scala.collection.mutable
+import scala.collection.JavaConverters._
+
+import org.dmg.pmml.{Node => PMMLNode, Value => PMMLValue, _}
+
+import org.apache.spark.mllib.tree.configuration.{Algo, FeatureType}
+import org.apache.spark.mllib.tree.configuration.Algo._
+import org.apache.spark.mllib.tree.model.{DecisionTreeModel, Node}
+
+private[mllib] object PMMLTreeModelUtils {
+
+  val FieldNamePrefix = "field_"
+
+  def toPMMLTree(dtModel: DecisionTreeModel, modelName: String): (TreeModel, List[DataField]) = {
+
+    val miningFunctionType = dtModel.algo match {
+      case Algo.Classification => MiningFunctionType.CLASSIFICATION
+      case Algo.Regression => MiningFunctionType.REGRESSION
+    }
+
+    val treeModel = new TreeModel()
+      .setModelName(modelName)
+      .setFunctionName(miningFunctionType)
+      .setSplitCharacteristic(TreeModel.SplitCharacteristic.BINARY_SPLIT)
+
+    var (rootNode, miningFields, dataFields, classes) = buildStub(dtModel.topNode, dtModel.algo)
+
+    // adding predicted classes for classification and target field for regression for completeness
+    dtModel.algo match {
+
+      case Algo.Classification =>
+        miningFields = miningFields :+ new MiningField()
+          .setName(FieldName.create("class"))
+          .setUsageType(FieldUsageType.PREDICTED)
+
+        val dataField = new DataField()
+          .setName(FieldName.create("class"))
+          .setOpType(OpType.CATEGORICAL)
+          .addValues(classes: _*)
+          .setDataType(DataType.DOUBLE)
+
+        dataFields = dataFields :+ dataField
+
+      case Algo.Regression =>
+        val targetField = FieldName.create("target")
+        val dataField = new DataField(targetField, OpType.CONTINUOUS, DataType.DOUBLE)
+        dataFields = dataFields :+ dataField
+
+        miningFields = miningFields :+ new MiningField()
+          .setName(targetField)
+          .setUsageType(FieldUsageType.TARGET)
+
+    }
+
+    val miningSchema = new MiningSchema().addMiningFields(miningFields: _*)
+
+    treeModel.setNode(rootNode).setMiningSchema(miningSchema)
+
+    (treeModel, dataFields)
+  }
+
+  /** Build a pmml tree stub given the root mllib node. */
+  private def buildStub(rootDTNode: Node, algo: Algo):
+      (PMMLNode, List[MiningField], List[DataField], List[PMMLValue]) = {
+
+    val miningFields = mutable.MutableList[MiningField]()
+    val dataFields = mutable.HashMap[String, DataField]()
+    val classes = mutable.MutableList[Double]()
+
+    def buildStubInternal(rootNode: Node, predicate: Predicate): PMMLNode = {
+
+      // get rootPMML node for the MLLib node
+      val rootPMMLNode = new PMMLNode()
+        .setId(rootNode.id.toString)
+        .setScore(rootNode.predict.predict.toString)
+        .setPredicate(predicate)
+
+      var leftPredicate: Predicate = new True()
+      var rightPredicate: Predicate = new True()
+
+      if (rootNode.split.isDefined) {
+        val fieldName = FieldName.create(FieldNamePrefix + rootNode.split.get.feature)
+        val dataField = getDataField(rootNode, fieldName).get
+
+        if (dataFields.get(dataField.getName.getValue).isEmpty) {
+          dataFields.put(dataField.getName.getValue, dataField)
+          miningFields += new MiningField()
+            .setName(dataField.getName)
+            .setUsageType(FieldUsageType.
[GitHub] spark pull request #7842: [SPARK-8542][MLlib]PMML export for Decision Trees
Github user jomach commented on a diff in the pull request: https://github.com/apache/spark/pull/7842#discussion_r144642031 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/pmml/export/PMMLTreeModelUtils.scala --- (same diff excerpt as quoted above)
[GitHub] spark pull request #7842: [SPARK-8542][MLlib]PMML export for Decision Trees
Github user jomach commented on a diff in the pull request: https://github.com/apache/spark/pull/7842#discussion_r144642055 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/pmml/export/PMMLTreeModelUtils.scala --- (same diff excerpt as quoted above)
[GitHub] spark issue #19485: [SPARK-20055] [Docs] Added documentation for loading csv...
Github user jomach commented on the issue: https://github.com/apache/spark/pull/19485 @HyukjinKwon I came up with this. What do you think? What I don't like about it is that I did not find any way to read Javadocs into the markdown so that we don't have duplicates. Any idea, or should we leave it as in this PR?
[GitHub] spark issue #19485: [SPARK-20055] [Docs] Added documentation for loading csv...
Github user jomach commented on the issue: https://github.com/apache/spark/pull/19485 Yes, I will do it. Give me a few days, please.
[GitHub] spark issue #19485: [SPARK-20055] [Docs] Added documentation for loading csv...
Github user jomach commented on the issue: https://github.com/apache/spark/pull/19485 @HyukjinKwon Here is the new one, as the other is closed/merged
[GitHub] spark pull request #19485: [SPARK-20055] [Docs] Added documentation for load...
GitHub user jomach opened a pull request: https://github.com/apache/spark/pull/19485 [SPARK-20055] [Docs] Added documentation for loading csv files into DataFrames Fix

## What changes were proposed in this pull request?
Small rendering fix

## How was this patch tested?
Reviewers

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jomach/spark master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19485.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #19485

commit f5941bf196a36afe8715d713fcaaf3f1a136d9e8
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date: 2017-10-04T13:09:16Z
SPARK-20055 Documentation - Added documentation for loading csv files into Dataframes

commit 812bdf7a44ed2e52c7012921814da6bb73d0033c
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date: 2017-10-04T12:58:44Z
SPARK-20055 Documentation - Some examples on how to create a dataframe with a csv file (cherry picked from commit e8ca1dc)

commit 4e4a02ba271bfb9811d31cd1909c942be4322682
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date: 2017-10-04T13:09:16Z
SPARK-20055 Documentation - Added documentation for loading csv files into Dataframes

commit a2ec38a7b86b9cf89f7f4b9cf6368b9864ef10c2
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date: 2017-10-05T08:27:20Z
SPARK-20055 Documentation - Some examples on how to create a dataframe with a csv file (cherry picked from commit a546421)

commit 793628bbedcc50c0845a3fd999d2720e2c63ea1d
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date: 2017-10-05T08:32:15Z
Merge remote-tracking branch 'origin/master'
# Conflicts:
# docs/sql-programming-guide.md
# examples/src/main/java/org/apache/spark/examples/sql/JavaSQLDataSourceExample.java
# examples/src/main/r/RSparkSQLExample.R

commit cd69fa240d453a7b8344796349a2bf03a20ffbfc
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date: 2017-10-10T05:52:37Z
SPARK-20055 Documentation - Some examples on how to create a dataframe with a csv file

commit 68799ede999ec1874c80d242441032cd29a2f695
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date: 2017-10-11T07:29:33Z
SPARK-20055 Documentation - PR comments

commit 7ff1d84779acc50ab3c63d9bc0651ac53193f555
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date: 2017-10-11T08:09:49Z
SPARK-20055 Documentation - PR comments

commit 07d73fcac85529fa17e34b170f2941f0f579fe00
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date: 2017-10-12T15:12:35Z
Merge branch 'upstream/masterlocal'

commit 73b1d7aed4c0fd740d5fbdde569d6b3ff3b86271
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date: 2017-10-12T16:03:51Z
SPARK-20055 Documentation - PR comments
[GitHub] spark pull request #19429: [SPARK-20055] [Docs] Added documentation for load...
Github user jomach commented on a diff in the pull request: https://github.com/apache/spark/pull/19429#discussion_r144321507

--- Diff: docs/sql-programming-guide.md ---
@@ -479,6 +481,26 @@ source type can be converted into other types using this syntax.

+To load a CSV file you can use:
+
+{% include_example manual_load_options_csv scala/org/apache/spark/examples/sql/SQLDataSourceExample.scala %}
+
+{% include_example manual_load_options_csv java/org/apache/spark/examples/sql/JavaSQLDataSourceExample.java %}
+
+{% include_example manual_load_options_csv python/sql/datasource.py %}
+
+{% include_example manual_load_options_csv r/RSparkSQLExample.R %}
+
### Run SQL on files directly
--- End diff --

@HyukjinKwon should I add a new line between line 503 and 504? For example:
```
{% include_example generic_load_save_functions r/RSparkSQLExample.R %}

### Manually Specifying Options
```
[GitHub] spark issue #19429: [SPARK-20055] [Docs] Added documentation for loading csv...
Github user jomach commented on the issue: https://github.com/apache/spark/pull/19429 @gatorsmile PR comments fixed. The problem with the current docs is that people who are starting out with Spark usually don't begin with JSON files but with CSV files, to "see" something.
[GitHub] spark issue #19429: [SPARK-20055] [Docs] Added documentation for loading csv...
Github user jomach commented on the issue: https://github.com/apache/spark/pull/19429 @gatorsmile PR comments fixed. Sorry, it is my first time.
[GitHub] spark issue #19429: [SPARK-20055] [Docs] Added documentation for loading csv...
Github user jomach commented on the issue: https://github.com/apache/spark/pull/19429 @gatorsmile I addressed your comments. Still, I cannot run the Jekyll build...
```
SKIP_API=1 jekyll build --incremental
Configuration file: /Users/jorge/Downloads/spark/docs/_config.yml
Deprecation: The 'gems' configuration option has been renamed to 'plugins'. Please update your config file accordingly.
Source: /Users/jorge/Downloads/spark/docs
Destination: /Users/jorge/Downloads/spark/docs/_site
Incremental build: enabled
Generating...
Liquid Exception: invalid byte sequence in US-ASCII in _layouts/redirect.html
```
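For what it's worth, the "invalid byte sequence in US-ASCII" Liquid exception is typically caused by Ruby/Jekyll running under a non-UTF-8 locale rather than by the docs themselves. A commonly suggested workaround (not verified against this exact setup) is to force a UTF-8 locale before building:

    # Make Ruby treat source files as UTF-8 instead of US-ASCII.
    export LC_ALL=en_US.UTF-8
    export LANG=en_US.UTF-8

    # Then rebuild the docs; SKIP_API=1 skips API doc generation.
    SKIP_API=1 jekyll build --incremental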
[GitHub] spark issue #19429: [SPARK-20055] [Docs] Added documentation for loading csv...
Github user jomach commented on the issue: https://github.com/apache/spark/pull/19429 @felixcheung Sorry for that. It should be there now. Can you test? Thanks
[GitHub] spark pull request #19429: [SPARK-20055] [Docs] Added documentation for load...
GitHub user jomach opened a pull request: https://github.com/apache/spark/pull/19429 [SPARK-20055] [Docs] Added documentation for loading csv files into DataFrames

## What changes were proposed in this pull request?
Added documentation for loading csv files into Dataframes

## How was this patch tested?
(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests) (If this patch involves UI changes, please attach a screenshot; otherwise, remove this) Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jomach/spark master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19429.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #19429

commit f5941bf196a36afe8715d713fcaaf3f1a136d9e8
Author: Jorge Machado <jorge.w.mach...@hotmail.com>
Date: 2017-10-04T13:09:16Z
SPARK-20055 Documentation - Added documentation for loading csv files into Dataframes