Works for me too, you are a life-saver :)
But the question: should we report this to the Azure team, and if so, how?
On Fri, May 12, 2017 at 10:32 AM, Denny Lee wrote:
> I was able to repro your issue when I had downloaded the jars via blob but
> when I downloaded them as raw, I was
Hi Spark Users,
I want to store an Enum type (such as Vehicle Type: Car, SUV, Wagon) in my
data. My storage format will be Parquet, and I need to access the data from
spark-shell, the Spark SQL CLI, and Hive. My questions:
1) Should I store my Enum type as a String, or store it as a numeric encoding
(aka
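For what it's worth, here is a minimal plain-Python sketch of the two encodings being compared (the names and values below are illustrative, not from this thread):

```python
# Option 1: store the label itself as a string (self-describing; every
# reader -- Spark, Hive, the SQL CLI -- sees "SUV" directly).
vehicle_type_str = "SUV"

# Option 2: store a numeric code plus a separate lookup table
# (compact on disk, but every consumer needs the mapping to decode it).
VEHICLE_CODES = {"Car": 0, "SUV": 1, "Wagon": 2}
CODE_TO_NAME = {v: k for k, v in VEHICLE_CODES.items()}

vehicle_type_code = VEHICLE_CODES["SUV"]   # 1
decoded = CODE_TO_NAME[vehicle_type_code]  # "SUV"
```

One thing worth noting: Parquet typically dictionary-encodes repeated string values anyway, so the on-disk size difference between the two options is often smaller than it looks.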
I was able to repro your issue when I had downloaded the jars via blob but
when I downloaded them as raw, I was able to get everything up and
running. For example:
wget https://github.com/Azure/azure-documentdb-spark/*blob*
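As I understand the fix, a `github.com/.../blob/...` URL returns the HTML viewer page rather than the jar itself, so the downloaded file is not a valid archive. A sketch of rewriting such a URL into its raw equivalent (the jar path below is hypothetical, just for illustration):

```shell
# A /blob/ URL serves the GitHub HTML page, not the file bytes.
blob_url="https://github.com/Azure/azure-documentdb-spark/blob/master/some-release.jar"

# Rewrite it to the raw.githubusercontent.com form, which serves the raw file.
raw_url=$(echo "$blob_url" | sed -e 's#//github.com#//raw.githubusercontent.com#' -e 's#/blob/#/#')

echo "$raw_url"
# wget "$raw_url"   # then pass the downloaded jar to spark-shell --jars
```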
Hey all,
I’ve found myself in a position where I need to do a relatively large matrix
multiply (at least, compared to what I normally have to do). I’m looking to
multiply a 100k by 500k dense matrix by its transpose to yield a 100k by 100k
matrix. I’m trying to do this on Google Cloud, so I
Rick,
Thank you for the input. The space issue is now resolved:
yarn.nodemanager.local.dirs and yarn.nodemanager.log.dirs were filling up.
But for 5 GB of data, why should it take 10 minutes to load with 7-8 executors with
2 cores each? I also see that the executors' memory usage is up to 7-20 GB.
If 5 GB of data takes
Use the official mailing list archive
http://mail-archives.apache.org/mod_mbox/spark-user/201705.mbox/%3ccajyeq0gh1fbhbajb9gghognhqouogydba28lnn262hfzzgf...@mail.gmail.com%3e
On Thu, May 11, 2017 at 2:50 PM, lucas.g...@gmail.com
wrote:
> Also, and this is unrelated to the
Might want to try gzip as opposed to parquet. The only way I
ever reliably got parquet to work on S3 is by using Alluxio as a
buffer, but it's a decent amount of work.
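In case it helps to see the suggested alternative concretely, here is a minimal plain-Python sketch of gzipped newline-delimited JSON (the path and records are illustrative; in Spark the equivalent would be a gzip-compressed text/JSON output instead of Parquet):

```python
import gzip
import json
import os
import tempfile

# Write a few records as gzip-compressed, newline-delimited JSON.
records = [{"id": i, "value": i * i} for i in range(3)]
path = os.path.join(tempfile.gettempdir(), "part-00000.json.gz")

with gzip.open(path, "wt", encoding="utf-8") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

# Read it back to confirm the round trip.
with gzip.open(path, "rt", encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]

print(loaded == records)  # True
```

The trade-off: gzip text is a plain sequential format (no columnar pruning or predicate pushdown), which is exactly why it tends to be less fragile on S3 than Parquet's multi-part commit behavior.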
On Thu, May 11, 2017 at 11:50 AM, lucas.g...@gmail.com
wrote:
> Also, and this is unrelated to the
Also, and this is unrelated to the actual question... Why don't these
messages show up in the archive?
http://apache-spark-user-list.1001560.n3.nabble.com/
Ideally I'd want to post a link to our internal wiki for these questions,
but can't find them in the archive.
On 11 May 2017 at 07:16,
I would try to track down the "no space left on device" error - find out where
it originates from, since you should be able to allocate 10 executors
with 4 cores and 15 GB RAM each quite easily. In that case, you may want to
increase the memory overhead, so YARN doesn't kill your executors.
Check that no local
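For reference, a sketch of the overhead bump being suggested (the values are illustrative only; note that in Spark 2.3+ the property was renamed to `spark.executor.memoryOverhead`):

```shell
# Illustrative values: give each executor extra off-heap headroom so
# YARN's container limit isn't breached by non-heap memory.
spark-shell \
  --master yarn \
  --num-executors 10 \
  --executor-cores 4 \
  --executor-memory 15G \
  --conf spark.yarn.executor.memoryOverhead=2048
```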
Hi,
I am reading a Hive ORC table into memory, with StorageLevel set to
StorageLevel.MEMORY_AND_DISK_SER.
The total size of the Hive table is 5 GB.
I started spark-shell as below:
spark-shell --master yarn --deploy-mode client --num-executors 8
--driver-memory 5G --executor-memory 7G
I realized that in Spark ML, BinaryClassificationMetrics only supports
areaUnderPR and areaUnderROC. Why is that? I
What if I need other metrics such as F-score, accuracy? I tried to use
MulticlassClassificationEvaluator to evaluate other metrics such as
Accuracy for a binary classification
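As a point of reference, these extra metrics are cheap to compute yourself from the confusion-matrix counts; a plain-Python sketch (the counts below are made up for illustration):

```python
# Binary-classification metrics from raw confusion-matrix counts.
# tp/fp/fn/tn values are made up for illustration.
tp, fp, fn, tn = 40, 10, 5, 45

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(accuracy)        # 0.85
print(round(f1, 4))    # 0.8421
```

As far as I know, MulticlassClassificationEvaluator's `f1` and `accuracy` metric names do also work on binary labels (it just treats them as a 2-class problem), which is the usual workaround in Spark ML.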
Looks like this isn't viable in Spark 2.0.0 (and greater, I presume). I'm
pretty sure I came across this blog before and ignored it for that reason.
Any other thoughts? The linked tickets in:
https://issues.apache.org/jira/browse/SPARK-10063
https://issues.apache.org/jira/browse/HADOOP-13786
Hello David,
Let me make it clearer:
* There is no Spark installed on the Windows laptop, just IntelliJ and
the related dependencies.
* SparkLauncher is a good starting point for submitting a job programmatically,
but I am not sure if my current problem is related to job
Hi
Thanks for the reply, but unfortunately it did not work; I am getting the same error.
sshuser@ed0-svochd:~/azure-spark-docdb-test$ spark-shell --jars
azure-documentdb-spark-0.0.3-SNAPSHOT.jar,azure-documentdb-1.10.0.jar
SPARK_MAJOR_VERSION is set to 2, using Spark2
Setting default log level to "WARN".
I am trying to convert Avro records with field type = bytes to a JSON string
using Structured Streaming in Spark 2.1. Please see below.
object AvroConvert {
  case class KafkaMessage(
    payload: String
  )

  val schemaString = """{