Hello,

I have a proposal for a small improvement to the Data Source API, and I'd like 
to know whether it sounds like a change the Spark project would accept.

Currently, the `.save` method in DataFrameWriter fails if the write is 
bucketed and/or sorted (that is, if `bucketBy` or `sortBy` has been called on 
the writer). This makes sense, since the current file-based data sources have 
no way of storing the metadata needed to know whether a file was bucketed or 
not.

I have a use case where I would like to implement a new file-based data source 
that can keep track of that kind of metadata itself (without using the Hive 
metastore), so I would like to be able to `.save` bucketed dataframes.

Would a patch that extends the Data Source API with an indicator of whether a 
source is able to serialize bucketed dataframes be a welcome addition? I'm 
happy to work on it if that's the case.
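
To make the idea a bit more concrete, here is a rough, Spark-free sketch of the 
kind of check I have in mind. Every name in it (`BucketSpec`, 
`supports_bucketed_write`, the two source classes, and the stand-alone `save` 
function) is hypothetical and only illustrates the shape of the indicator. It 
is not the existing API, and the real change would probably look different.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class BucketSpec:
    """Bucketing/sorting layout requested for a write (illustrative only)."""
    num_buckets: int
    bucket_columns: Tuple[str, ...]
    sort_columns: Tuple[str, ...] = ()


class PlainFileSource:
    """Stand-in for a file-based source with nowhere to keep bucket metadata."""

    # Hypothetical capability indicator; sources opt in by overriding it.
    supports_bucketed_write = False

    def __init__(self) -> None:
        self.rows: list = []
        self.bucket_spec: Optional[BucketSpec] = None

    def write(self, rows: Sequence[dict], bucket_spec: Optional[BucketSpec]) -> None:
        # A real source would write files; here we only record what was asked.
        self.rows = list(rows)
        self.bucket_spec = bucket_spec


class MetadataTrackingSource(PlainFileSource):
    """Stand-in for a source that stores the bucket spec alongside its files."""

    supports_bucketed_write = True


def save(source: PlainFileSource,
         rows: Sequence[dict],
         bucket_spec: Optional[BucketSpec] = None) -> None:
    """Reject a bucketed write only when the source cannot record the layout."""
    if bucket_spec is not None and not source.supports_bucketed_write:
        raise ValueError("this source cannot save bucketed or sorted output")
    source.write(rows, bucket_spec)
```

With something like this, unbucketed writes keep working for every source, 
bucketed writes keep failing for sources that do not opt in, and only sources 
that declare the capability receive the bucket spec.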

I have opened this as https://issues.apache.org/jira/browse/SPARK-26160 in the 
Spark Jira.

Cheers,
Ximo.
