Re: Rust sync meeting

2021-04-07 Thread Mike Seddon
been looking at the > settings and I'm not sure how I can change that. > > Regarding the timing of the call, I am open to moving it if we can find a > time that accommodates everyone reasonably well. I think we have regular > attendance from the US, UK, Europe, South Africa

Re: Rust sync meeting

2021-03-24 Thread Mike Seddon
Hi Jorge, Can you please confirm the starting time (and timezone) and correct Google Meet link of the Rust and DataFusion sync calls? I missed today's due to daylight saving time changes - which is going to make Sydney time even harder. Thanks Mike On Thu, Mar 25, 2021 at 3:15 AM Jorge Cardoso L

Re: [Rust][DataFusion] Supporting input_file_name()

2021-02-25 Thread Mike Seddon
> Are you using it for file statistics? > > On Thu, Feb 25, 2021 at 9:36 AM Mike Seddon wrote: > > > Hi Fernando, > > > > After Andrew's reply I have moved the filename metadata into the Schema > and > > actually changed the ScalarFunctionImplementation sign

Re: [Rust][DataFusion] Supporting input_file_name()

2021-02-25 Thread Mike Seddon
RecordBatch::try_new(Arc::new(schema.clone()), vec![Arc::new(a), > > Arc::new(b)]).unwrap(); > > > > > let stream = TcpStream::connect("127.0.0.1:8000").unwrap(); > > let mut writer = StreamWriter::try_new(stream, &schema).unwrap(); > > writer.write(

Re: [Rust][DataFusion] Supporting input_file_name()

2021-02-24 Thread Mike Seddon
d guess it is likely possible to "hoist" > metadata from a record batch schema object to the Message but understand if > it isn't something you want to pursue. > > On Wed, Feb 24, 2021 at 8:19 PM Mike Seddon wrote: > >> Hi Micah, >> Thank you for providing thi

Re: [Rust][DataFusion] Supporting input_file_name()

2021-02-24 Thread Mike Seddon
l on this list. Once there is consensus we can > formally vote and merge the change. > > [1] > https://github.com/apache/arrow/blob/master/docs/source/format/Columnar.rst > > On Wed, Feb 24, 2021 at 3:47 PM Mike Seddon wrote: > > > Thanks for both of your comments. >

Re: [Rust][DataFusion] Supporting input_file_name()

2021-02-24 Thread Mike Seddon
the RecordBatch > > > > > > > https://docs.rs/arrow/3.0.0/arrow/datatypes/struct.Schema.html#method.new_with_metadata > > > > On Wed, Feb 24, 2021 at 1:20 AM Mike Seddon wrote: > > > > > Hi, > > > > > > One of Apache Spark's very useful SQL func

[Rust][DataFusion] Supporting input_file_name()

2021-02-23 Thread Mike Seddon
Hi, One of Apache Spark's very useful SQL functions is 'input_file_name', which provides a simple API for identifying the source of a row of data when it comes from a file-based source like Parquet or CSV. This is particularly useful for identifying which chunk/partition of a Parque

DataFusion Postgres License Requirements

2021-02-22 Thread Mike Seddon
Hi, The DataFusion project (an in-memory SQL engine built upon Arrow in Rust) has decided to adopt the Postgres dialect of SQL. The Postgres 'dialect' largely refers to the functions/API that Postgres has added in addition to meeting the ANSI SQL standard functions, as all dialects have slightly di

Re: [Rust] [DataFusion] Target-typing for string literal scalars in queries

2021-02-11 Thread Mike Seddon
Hi guys, Sorry, I forgot to reply to this too. I have done some work on the coercion recently: https://github.com/apache/arrow/blob/b799b662f19050270df4f8d67c6fec5fb7492172/rust/datafusion/src/physical_plan/type_coercion.rs#L167 I remember seeing a document with a two-dimensional array showing the v

Re: [Rust] DataFusion TPCH benchmark overview

2021-02-03 Thread Mike Seddon
Hi Daniël, I am working on query 22 as part of https://github.com/apache/arrow/pull/9243 We also need to convert all the Float64 schema types to Decimal(n). Cheers, Mike On Thu, Feb 4, 2021 at 5:44 AM Daniël Heres wrote: > Hey all, > > Quite some features have been added to DataFusion in the last c