BenComer98 opened a new issue, #4106:
URL: https://github.com/apache/texera/issues/4106

   ### What happened?
   
   **Release**: 1.1.0
   **Platform**: Windows, using PowerShell
   **UI**: Google Chrome
   
   **Background**: I'm using a Python User Defined Function (UDF) to compute a priority score for a dataset. A CSV File Scan reads the data from a CSV file that I attached as a dataset, the UDF adds a field "priority_score", and a Sort orders the rows by that score in descending order. The computing unit crashes and produces the error shown in the log section.
   
   I have attached the files I used to produce this error. I also tried a smaller CSV made up for this issue, but it did not trigger the error.
   
   
   [appointments.csv](https://github.com/user-attachments/files/23945601/appointments.csv) <- CSV
   
   [Issue Repro.json](https://github.com/user-attachments/files/23945603/Issue.Repro.json) <- Exported Issue repro
   
   ### How to reproduce?
   
   1. Add a dataset containing appointments.csv to Texera.
   2. Add a Workflow. In that Workflow, add a CSV File Scan, a Python UDF (normal, no variation), and a Sort, and connect them in that order.
   3. In the CSV File Scan, select the appointments.csv file with headers on.
   4. In the Python UDF, add a new field called "priority_score" with type double and paste the following:
   `from pytexera import *
   from datetime import date
   import math
   
   class ProcessTupleOperator(UDFOperatorV2):
       
       @overrides
       def process_tuple(self, tuple_: Tuple, port: int) -> Iterator[Optional[TupleLike]]:
           appointment_date = tuple_["appointment_datetime"].date()
   
           today = date.today()
           days_until = (appointment_date - today).days
           if days_until >= 0:
               tuple_["priority_score"] = 1 / math.log(days_until + 2) + 1
           else:
               tuple_["priority_score"] = 0.0
   
           # Output the tuple
           yield tuple_
   
   `
   5. In the Sort operator, sort by priority_score DESC.
   6. Add a Computing Unit and run the workflow. The backend throws the errors shown in the log output.
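
For reference, the scoring logic from step 4 can be exercised outside Texera. The following is a minimal standalone sketch, not part of the workflow: the function name `priority_score` and the explicit `today` parameter are made up for illustration, and nothing here uses the pytexera API. It is only meant to help check whether the arithmetic fails on its own, separately from the timestamp conversion named in the stack trace.

```python
import math
from datetime import date, datetime


def priority_score(appointment_datetime: datetime, today: date) -> float:
    """Mirror the scoring expression of the UDF in step 4 (hypothetical helper)."""
    days_until = (appointment_datetime.date() - today).days
    if days_until >= 0:
        # Same expression as the UDF. days_until + 2 is at least 2,
        # so the logarithm is positive and the division is safe.
        return 1 / math.log(days_until + 2) + 1
    # Appointments in the past get the lowest score.
    return 0.0


if __name__ == "__main__":
    # Timestamp taken from the error message in the log output;
    # "today" is set to the date of that log entry.
    ts = datetime.fromisoformat("2026-09-05T01:01:12")
    print(priority_score(ts, date(2025, 12, 5)))
```

Passing `today` in explicitly, rather than calling `date.today()` inside the function, is a choice made here only to keep the check repeatable.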
   
   ### Version
   
   1.1.0-incubating (Pre-release/Master)
   
   ### Commit Hash (Optional)
   
   _No response_
   
   ### What browsers are you seeing the problem on?
   
   _No response_
   
   ### Relevant log output
   
   ```shell
   texera-computing-unit-master            | [2025-12-05 00:08:11,439] [WARN] [edu.uci.ics.amber.util.ArrowUtils$] [flight-server-default-executor-0] - Caught error during parsing Arrow value back to Texera value
   texera-computing-unit-master            | edu.uci.ics.amber.core.tuple.AttributeTypeUtils$AttributeTypeException: Failed to parse type java.time.LocalDateTime to Timestamp: 2026-09-05T01:01:12
   texera-computing-unit-master            |       at edu.uci.ics.amber.core.tuple.AttributeTypeUtils$$anonfun$parseTimestamp$2.applyOrElse(AttributeTypeUtils.scala:219)
   texera-computing-unit-master            |       at edu.uci.ics.amber.core.tuple.AttributeTypeUtils$$anonfun$parseTimestamp$2.applyOrElse(AttributeTypeUtils.scala:215)
   texera-computing-unit-master            |       at scala.util.Failure.recover(Try.scala:233)
   texera-computing-unit-master            |       at edu.uci.ics.amber.core.tuple.AttributeTypeUtils$.parseTimestamp(AttributeTypeUtils.scala:215)
   texera-computing-unit-master            |       at edu.uci.ics.amber.core.tuple.AttributeTypeUtils$.parseField(AttributeTypeUtils.scala:128)
   texera-computing-unit-master            |       at edu.uci.ics.amber.util.ArrowUtils$.$anonfun$getTexeraTuple$1(ArrowUtils.scala:82)
   texera-computing-unit-master            |       at scala.collection.StrictOptimizedIterableOps.map(StrictOptimizedIterableOps.scala:100)
   texera-computing-unit-master            |       at scala.collection.StrictOptimizedIterableOps.map$(StrictOptimizedIterableOps.scala:87)
   texera-computing-unit-master            |       at scala.collection.convert.JavaCollectionWrappers$JCollectionWrapper.map(JavaCollectionWrappers.scala:98)
   texera-computing-unit-master            |       at edu.uci.ics.amber.util.ArrowUtils$.getTexeraTuple(ArrowUtils.scala:77)
   texera-computing-unit-master            |       at edu.uci.ics.amber.engine.architecture.pythonworker.AmberProducer.$anonfun$acceptPut$2(PythonProxyServer.scala:143)
   texera-computing-unit-master            |       at edu.uci.ics.amber.engine.architecture.pythonworker.AmberProducer.$anonfun$acceptPut$2$adapted(PythonProxyServer.scala:142)
   texera-computing-unit-master            |       at scala.collection.immutable.Range.foreach(Range.scala:190)
   texera-computing-unit-master            |       at edu.uci.ics.amber.engine.architecture.pythonworker.AmberProducer.$anonfun$acceptPut$1(PythonProxyServer.scala:142)
   texera-computing-unit-master            |       at org.apache.arrow.flight.FlightService.lambda$doPutCustom$0(FlightService.java:233)
   texera-computing-unit-master            |       at io.grpc.Context$1.run(Context.java:566)
   texera-computing-unit-master            |       at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
   texera-computing-unit-master            |       at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
   texera-computing-unit-master            |       at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
   texera-computing-unit-master            |       at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
   texera-computing-unit-master            |       at java.base/java.lang.Thread.run(Thread.java:829)
   texera-computing-unit-master            | Caused by: edu.uci.ics.amber.core.tuple.AttributeTypeUtils$AttributeTypeException: Unsupported type for parsing to Timestamp: java.time.LocalDateTime
   texera-computing-unit-master            |       at edu.uci.ics.amber.core.tuple.AttributeTypeUtils$.$anonfun$parseTimestamp$1(AttributeTypeUtils.scala:209)
   texera-computing-unit-master            |       at scala.util.Try$.apply(Try.scala:210)
   texera-computing-unit-master            |       at edu.uci.ics.amber.core.tuple.AttributeTypeUtils$.parseTimestamp(AttributeTypeUtils.scala:202)
   ```

