I want to use Apache Spark to work with text data. There are some Russian
characters, but Apache Spark shows me strings that look like
"...\u0413\u041e\u0420\u041e...". What should I do to display them correctly?
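In many cases those \uXXXX sequences are not corrupted data at all: they are the escaped representation that Python 2 printed for unicode objects (e.g. when you look at the raw output of rdd.take() or df.collect()), while print renders the actual Cyrillic. A minimal pure-Python sketch of the effect:

```python
# -*- coding: utf-8 -*-
# "\u0413\u041e\u0420\u041e" is an escape-sequence spelling of four
# Cyrillic letters, not broken data.
s = "\u0413\u041e\u0420\u041e"
print(s)         # prints the readable text: ГОРО
print(ascii(s))  # prints the \uXXXX escapes - this is what Python 2's
                 # repr() showed by default for unicode strings
assert s == "ГОРО"
```

So if print(value) shows readable Cyrillic, nothing needs fixing; only the debug representation differs.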
--
Hello!
I want to use another directory instead of /tmp for all temporary files.
I set spark.local.dir and -Djava.io.tmpdir=/..., but I see that Spark still
uses /tmp for some data.
What is Spark doing, and what should I do so that Spark uses only my
directories?
Thank you!
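For reference, /tmp can come from several places at once: spark.local.dir covers shuffle and spill files, while java.io.tmpdir has to be set separately for the driver and the executor JVMs (and on YARN, spark.local.dir is overridden by the node manager's local directories). A sketch for spark-defaults.conf, with /data/spark-tmp as an illustrative path:

```
# spark-defaults.conf - /data/spark-tmp is an illustrative path
spark.local.dir                  /data/spark-tmp
spark.driver.extraJavaOptions    -Djava.io.tmpdir=/data/spark-tmp
spark.executor.extraJavaOptions  -Djava.io.tmpdir=/data/spark-tmp
```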
--
Hello everybody,
I want to work with DataFrames where some columns have a string type and
contain Russian letters.
The Russian letters come out incorrect in the text. Could you help me with
how I should work with them?
Thanks.
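If the letters come out as mojibake rather than \u-escapes, the usual cause is that the source file is not UTF-8 (Russian data is often cp1251) while Spark's text readers assume UTF-8. A pure-Python sketch of the mismatch, assuming cp1251 input; in PySpark you could read raw bytes (e.g. sc.textFile(path, use_unicode=False)) and decode them the same way:

```python
# -*- coding: utf-8 -*-
# A cp1251-encoded byte string, as it might arrive from a non-UTF-8 file.
raw = "ПРИВЕТ".encode("cp1251")

# Decoding with the wrong codec produces mojibake instead of Cyrillic.
wrong = raw.decode("latin-1")
print(wrong)   # ÏÐÈÂÅÒ - unreadable

# Decoding with the correct codec recovers the text.
right = raw.decode("cp1251")
print(right)   # ПРИВЕТ
assert right == "ПРИВЕТ"
```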
--
I get an error in Apache Spark:
"spark.driver.memory 60g
spark.python.worker.memory 60g
spark.master local[*]"
The amount of data is about 5 GB, but Spark says "GC overhead limit
exceeded". I think my conf file gives it enough resources.
"16/05/16 15:13:02 WARN
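Two things worth checking here. First, in client/local mode the driver JVM is already running before application code executes, so spark.driver.memory only takes effect from spark-defaults.conf or the command line, not from a SparkConf built inside the script. Second, spark.python.worker.memory is only a spill threshold for the Python workers, not a JVM heap setting, so it does not help against a JVM-side "GC overhead limit exceeded". A hedged launch sketch (my_job.py is an illustrative name; 60g assumes the machine really has that much RAM):

```
spark-submit --master "local[*]" --driver-memory 60g my_job.py
```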
Hello,
I have the same problem. Sometimes I get the error "Py4JError: Answer
from Java side is empty".
Sometimes my code works fine, but sometimes it doesn't.
Did you find out why it happens? What was the reason?
Thanks.
--
Hello.
I'm sorry, but did you find the answer?
I have a similar error and I cannot solve it. No one has answered me.
The Spark driver dies and I get the error "Answer from Java side is empty".
I thought it was because I made a mistake in the conf file.
I use Sparkling Water 1.6.3, Spark 1.6, and Java Oracle 8 or OpenJDK 7.
Every time I transform a Spark DataFrame into an H2O DataFrame I get this
error and the Spark cluster dies:
ERROR:py4j.java_gateway:Error while sending or receiving.
Traceback (most recent call last): File
Hello all,
I use this string when I'm launching Sparkling Water:
"--conf
spark.driver.extraClassPath='/SQLDrivers/sqljdbc_4.2/enu/sqljdbc41.jar"
and I get this error:
"
---
TypeError Traceback
Does anyone know what it means?
Py4JJavaError: An error occurred while calling
z:org.apache.spark.sql.execution.EvaluatePython.takeAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure:
Exception while getting task result:
I wrote spark.driver.extraClassPath '/dir' in "spark-defaults.conf",
or "PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook"
/.../sparkling-water-1.6.1/bin/pysparkling \ --conf
spark.driver.extraClassPath='/.../sqljdbc41.jar'"
Nothing works.
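One detail that stands out in the posted command is the unbalanced quoting around the --conf value ('...sqljdbc41.jar"), which the shell can mis-parse before Spark ever sees the option. A hedged re-quoting of the same launch line (the /.../ path is kept as in the original):

```
PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" \
  /.../sparkling-water-1.6.1/bin/pysparkling \
  --conf "spark.driver.extraClassPath=/SQLDrivers/sqljdbc_4.2/enu/sqljdbc41.jar"
```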
--
I get an error while creating a DataFrame from a parquet file:
Py4JJavaError: An error occurred while calling
z:org.apache.spark.sql.execution.EvaluatePython.takeAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure:
Exception while getting task result:
Hello, I've started using Spark 1.6.1; before that I used Spark 1.5.
I included the line export
SPARK_CLASSPATH="/SQLDrivers/sqljdbc_4.2/enu/sqljdbc41.jar" when I launched
pysparkling, and it worked well.
But in version 1.6.1 there is an error saying it's deprecated, and I had to use
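Since SPARK_CLASSPATH was deprecated, the usual replacement (the same properties tried elsewhere in this thread) is the pair of extraClassPath settings; a sketch for spark-defaults.conf:

```
# spark-defaults.conf - replacement for the deprecated SPARK_CLASSPATH
spark.driver.extraClassPath    /SQLDrivers/sqljdbc_4.2/enu/sqljdbc41.jar
spark.executor.extraClassPath  /SQLDrivers/sqljdbc_4.2/enu/sqljdbc41.jar
```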
Hello!
I work with SQLContext; I create a query against MS SQL Server and get data.
Spark tells me that I have to install Hive.
I have started using Spark 1.6.1 (before that I used Spark 1.5, and I never
heard about this requirement before).
Py4JJavaError: An error occurred while calling
Hello all,
I'm trying to use some SQL functions.
My task is to renumber the rows in a DataFrame.
I use the SQL functions, but they don't work and I don't understand why.
I would appreciate your help fixing this issue.
Thank you!
A piece of my code:
"from pyspark.sql.functions import row_number, percent_rank, rank,
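A likely cause: in Spark 1.6, ranking functions such as row_number only work over an explicit Window specification, and resolving window functions requires a HiveContext (a plain SQLContext rejects them). A sketch under those assumptions, with an illustrative column name id (it needs a running SparkContext sc, so it is not standalone-runnable):

```python
from pyspark.sql import HiveContext, Window
from pyspark.sql.functions import row_number

sqlContext = HiveContext(sc)  # plain SQLContext cannot resolve window functions in 1.6
df = sqlContext.createDataFrame([(30,), (10,), (20,)], ["id"])

# row_number() must be applied .over() a window; here rows are
# numbered 1, 2, 3, ... by ascending id
w = Window.orderBy("id")
numbered = df.withColumn("row_num", row_number().over(w))
numbered.show()
```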
I have the line spark.driver.maxResultSize=0 in spark-defaults.conf.
But I get an error:
"org.apache.spark.SparkException: Job aborted due to stage failure: Total
size of serialized results of 18 tasks (1070.5 MB) is bigger than
spark.driver.maxResultSize (1024.0 MB)"
But if I write --conf
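The error still reporting the 1024.0 MB default suggests the conf-file line was never picked up, e.g. the file is not in the directory spark-submit actually reads ($SPARK_HOME/conf, or $SPARK_CONF_DIR if set). A sketch of both ways to apply the setting (my_job.py is an illustrative name; 0 disables the limit):

```
# spark-defaults.conf
spark.driver.maxResultSize  0

# or explicitly on the command line, which bypasses conf-file lookup problems:
spark-submit --conf spark.driver.maxResultSize=0 my_job.py
```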
Hello everybody,
I use both the Python API and the Scala API. I read data without problems
with the Python API:
"sqlContext = SQLContext(sc)
data_full = sqlContext.read.parquet("---")"
But when I use Scala:
"val sqlContext = new SQLContext(sc)
val data_full = sqlContext.read.parquet("---")"
I get the error (I
Hello!
I want to use Scala from Jupyter (or maybe something else, if you can
recommend anything; I mean an IDE). Does anyone know how I can do this?
Thank you!
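At the time of this thread, one common route was the Apache Toree kernel (formerly spark-kernel), which adds a Scala/Spark kernel to Jupyter; a sketch of the install, assuming pip and a local Spark distribution (the --spark_home path is illustrative):

```
pip install toree
jupyter toree install --spark_home=/path/to/spark
```

Apache Zeppelin, or IntelliJ IDEA with the Scala plugin, were other common choices for Scala work.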
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Scala-from-Jupyter-tp26234.html