Hello,

My name is Mario Fernández and I'm a Big Data developer. I usually program 
Apache Spark in Java, and we have a serious problem reading a CSV file 
properly. The issue is this:

When I read a CSV file with, for instance, a semicolon delimiter, the 
DataFrame treats the semicolon as a delimiter, which is correct, but it also 
treats the comma as a delimiter, and that is the problem.
I have checked this in Apache Spark 1.6.2 (Scala 2.10) with DataFrame and in 
Apache Spark 2.0.2 (Scala 2.11) with Dataset, and the problem is the same.

dfFile1 = sqlContext.read()
        .format("com.databricks.spark.csv")
        .schema(customSchema)
        .option("charset", "Cp1252")
        .option("header", "true")
        .option("delimiter", ";")
        .load(path);

When I read a CSV file like the sample below, the DataFrame treats both the 
semicolons and the comma as delimiters (they were highlighted in yellow in the 
original message):
Number;Name;Surname;Category
129.363;Mathew, Thomas;Johnson;Centers Technician

The comma between Mathew and Thomas should not be treated as a delimiter.
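
To make the expected result concrete, here is a minimal sketch in plain Java 
(no Spark involved). The class and method names are made up for illustration, 
and a plain split ignores CSV quoting rules, so it only shows the field 
boundaries I expect for this sample line:

```java
import java.util.Arrays;
import java.util.List;

public class SemicolonSplitSketch {

    // Hypothetical helper: splits one line on semicolons only.
    // The limit of -1 keeps trailing empty fields instead of dropping them.
    static List<String> splitOnSemicolon(String line) {
        return Arrays.asList(line.split(";", -1));
    }

    public static void main(String[] args) {
        String line = "129.363;Mathew, Thomas;Johnson;Centers Technician";
        List<String> fields = splitOnSemicolon(line);

        // Four fields are expected; the comma stays inside the Name field.
        System.out.println(fields.size());   // prints 4
        System.out.println(fields.get(1));   // prints Mathew, Thomas
    }
}
```

In other words, I expect the Name column to contain "Mathew, Thomas" as a 
single value.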

I would like to know whether this is a bug that you are going to fix, or 
whether this is simply how the reader is meant to work.

Thank you so much in advance.

Kind Regards.



