xglv1985 opened a new issue, #6912:
URL: https://github.com/apache/kyuubi/issues/6912

   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   
   
   ### Search before asking
   
   - [x] I have searched in the [issues](https://github.com/apache/kyuubi/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Describe the bug
   
   ## How to reproduce the issue?
   The changes in PR #6911 avoid a wrong result when generating an instance of org.apache.kyuubi.plugin.lineage.Lineage in the following case:
   Step 1: create a temporary view from a file
   `CREATE OR REPLACE TEMPORARY VIEW temp_view
   (
    `a` STRING COMMENT '',
    `b` STRING COMMENT ''
   )
   USING csv OPTIONS(
       sep='\t',
       path='${sourceFile.path}'
   );`
   
   Step 2: insert into a table by selecting from the temporary view created in step 1
   `insert overwrite table test_db.test_table_from_dir
   SELECT
        `a`,
        `b`
   FROM temp_view`
   
   Step 3: generate the lineage while executing the insert statement from step 2
   
   Then **None** is produced instead of an org.apache.kyuubi.plugin.lineage.Lineage instance. The correct value should be:
   `inputTables(List())
   outputTables(List(spark_catalog.test_db.test_table_from_dir))
   columnLineage(List(ColumnLineage(spark_catalog.test_db.test_table_from_dir.a0,Set()), ColumnLineage(spark_catalog.test_db.test_table_from_dir.b0,Set())))`
   
   ## How is the issue introduced?
   The current code that builds the Lineage object by resolving a LogicalPlan object looks like this:
   <img width="694" alt="image" src="https://github.com/user-attachments/assets/65256a0d-320d-4271-968f-59eafb74de9f" />
   
   With this logic, the "try-recover" safeguard takes over in this case, so the result is None rather than an org.apache.kyuubi.plugin.lineage.Lineage object.
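
   To illustrate how such a guard can hide a failure, here is a minimal sketch. It is not the plugin's actual code: `LineageSketch`, `extract`, `parse`, and the failure condition are hypothetical stand-ins, and it assumes the resolution step throws for this plan shape.

   ```scala
   import scala.util.Try
   import scala.util.control.NonFatal

   // Hypothetical stand-in for org.apache.kyuubi.plugin.lineage.Lineage.
   final case class LineageSketch(inputTables: List[String], outputTables: List[String])

   object TryRecoverSketch {
     // Hypothetical extractor: assumed to throw for the plan shape it cannot handle.
     def extract(plan: String): LineageSketch =
       if (plan.contains("temp_view")) {
         throw new IllegalStateException(s"cannot resolve: $plan")
       } else {
         LineageSketch(Nil, List("spark_catalog.test_db.test_table_from_dir"))
       }

     // The guard: any non-fatal failure inside extract is turned into None,
     // so the caller never sees the original exception.
     def parse(plan: String): Option[LineageSketch] =
       Try(Option(extract(plan))).recover { case NonFatal(_) => None }.get
   }
   ```

   In this sketch, `TryRecoverSketch.parse` returns None for any plan mentioning `temp_view` and a populated value otherwise, which mirrors the silent fallback described above.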
   
   ## The consequence of this bug
   ### Unit Test Environment
   In the unit tests, a "None.get" exception is raised when execution reaches this code:
   <img width="682" alt="image" src="https://github.com/user-attachments/assets/102dc9bd-294f-4b1e-b1c6-01b6fee50fed" />
   
   Here's the runtime exception stack:
   `None.get
   java.util.NoSuchElementException: None.get
        at scala.None$.get(Option.scala:529)
        at scala.None$.get(Option.scala:527)
        at org.apache.kyuubi.plugin.lineage.helper.SparkSQLLineageParserHelperSuite.extractLineageWithoutExecuting(SparkSQLLineageParserHelperSuite.scala:1485)
        at org.apache.kyuubi.plugin.lineage.helper.SparkSQLLineageParserHelperSuite.$anonfun$new$83(SparkSQLLineageParserHelperSuite.scala:1465)`
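
   The exception in the stack trace above is what the Scala standard library throws when `.get` is called on None. The sketch below reproduces that in isolation and shows one way a caller could surface the missing value explicitly; `NoneGetSketch`, `parsed`, and `describe` are hypothetical names and are not part of the Kyuubi test suite.

   ```scala
   import scala.util.Try

   object NoneGetSketch {
     // Stand-in for the parser result in the failing case.
     val parsed: Option[String] = None

     // Calling .get on None throws java.util.NoSuchElementException with message "None.get".
     val failure: Try[String] = Try(parsed.get)

     // Pattern matching makes the missing lineage explicit instead of throwing.
     def describe(result: Option[String]): String = result match {
       case Some(lineage) => s"lineage: $lineage"
       case None          => "no lineage was produced for this plan"
     }
   }
   ```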
   ### Production Environment
   In production, the result is None instead of a Lineage object, so the necessary lineage information is missing and the result cannot be used.
   
   ### Affects Version(s)
   
   all versions
   
   ### Kyuubi Server Log Output
   
   ```logtalk
   unrelated
   ```
   
   ### Kyuubi Engine Log Output
   
   ```logtalk
   unrelated
   ```
   
   ### Kyuubi Server Configurations
   
   ```yaml
   unrelated
   ```
   
   ### Kyuubi Engine Configurations
   
   ```yaml
   unrelated
   ```
   
   ### Additional context
   
   I have already opened a PR for this issue; please see:
   https://github.com/apache/kyuubi/pull/6911
   
   ### Are you willing to submit PR?
   
   - [x] Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
   - [ ] No. I cannot submit a PR at this time.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

