GitHub user aglinxinyuan edited a discussion: Starter tasks for new contributors

Hi everyone,

I went through the open issue backlog and tagged a fresh batch of 
beginner-friendly issues with the `starter-task` label — **21 newly labeled**, 
bringing the total to **28 open starter tasks**. Each one is self-contained and 
clearly scoped, picked so a new contributor can land a real change without 
first needing deep knowledge of the Amber engine or the distributed runtime.

If you're new to Texera and looking for a first contribution, this is a great 
place to start.

**Browse them all:** https://github.com/apache/texera/labels/starter-task

## What's available

**Unit-test coverage** — add a focused spec for a single class or component 
(the gentlest on-ramp):
- #5664 — `MapOpDesc`
- #5663 — `DeadLetterMonitorActor`
- #5662 — `HeaderField`
- #5661 — `URLFetchUtil`
- #5474 — `VisualizationFrameContentComponent`

**Small bug fixes** — contained bugs with a clear fix location:
- #5666 — add a timeout and retry to `DatasetFileDocument` file-service requests
- #5042 — `LogicalLink` is not JSON round-trippable for `OperatorIdentity` 
fields
- #3546 — schema-missing error
- #3524 — handle optional properties of an operator (enum validator)
- #3497 — the description of read-only workflows is still editable

**Frontend / UI polish** — small, visible improvements:
- #3588 — fix the page height of the Admin page
- #3375 — improve the JVM-memory slider UI in the computing unit
- #3406 — option to sort the workflows list by last execution time

**Small features** — bounded enhancements:
- #4348 — auto-submit a dataset version when exporting results to a dataset
- #4315 — auto-install the latest package version when none is specified 
(Python venv)
- #4314 — allow uploading a `requirements.txt` when creating a Python venv
- #1956 — add a same-input-schema constraint to Intersect/Union operators

## How to pick one up

1. Comment `/take` on the issue (on its own line) to self-assign it — `/untake` 
releases it if your plans change.
2. Fork the repo and follow the [contributing 
guide](https://github.com/apache/texera/blob/main/CONTRIBUTING.md) for the 
fork-based workflow, Conventional-Commit PR titles, and how to run the backend 
(`sbt test`) and frontend (`ng test`) tests.
3. Open a PR with `Closes #<issue>` in the description.
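
For anyone unsure what a Conventional-Commit PR title looks like, here is a minimal sketch of a local check you could run before opening the PR. The list of allowed types, the helper names, and the `amber` scope in the test are assumptions for illustration only. The contributing guide and the repo's CI are the authority on what is actually accepted.

```python
import re

# Assumed list of Conventional Commit types; the project's CI may accept a different set.
ALLOWED_TYPES = (
    "feat", "fix", "docs", "style", "refactor",
    "perf", "test", "build", "ci", "chore", "revert",
)

TITLE_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)"           # type, e.g. "fix"
    r"(?:\((?P<scope>[^()]+)\))?"  # optional scope in parentheses
    r"(?P<breaking>!)?"            # optional breaking-change marker
    r": (?P<summary>\S.*)$"        # colon, one space, non-empty summary
)


def check_pr_title(title: str) -> bool:
    """Return True if the title has the shape `type(scope): summary`."""
    match = TITLE_PATTERN.match(title)
    return match is not None and match.group("type") in ALLOWED_TYPES


def closing_line(issue_number: int) -> str:
    """Build the line that links a PR to the issue it resolves."""
    return f"Closes #{issue_number}"
```

A title such as `fix: handle missing schema` passes this check, while a plain sentence like `Add MapOpDesc spec` does not.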

Questions are very welcome — ask right on the issue or reply here, and a 
committer will help you scope it. Looking forward to your first PR!


---

## Update — June 18, 2026

Added **7 more starter tasks** and closed two earlier items (#4119, #4319) 
that turned out to be already resolved. The live list is always at the 
[`starter-task` label](https://github.com/apache/texera/labels/starter-task).

**More unit-test coverage** — same gentle on-ramp (a focused spec for one 
class/component, no production-code changes):
- #5776 — `PortIdentityKeySerializer` / `PortIdentityKeyDeserializer` (the 
`id_internal` map-key serde + round-trip)
- #5777 — `DistributedAggregation` (the init/iterate/merge/finalAgg contract)
- #5778 — `TextGenCodegen` (Hugging Face `text-generation` payload/parse 
codegen)
- #5779 — `SessionUser` (`Principal` delegation + `isRoleOf`)
- #5780 — `AdminGuardService` (frontend route guard — admins allowed, others 
redirected)
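
If the init/iterate/merge/finalAgg contract mentioned for #5777 is unfamiliar, the sketch below shows the general idea in Python. It is not Texera's `DistributedAggregation` class: the `Aggregation` type, the `run_partial` helper, and the `average` example are hypothetical stand-ins. The property a spec would typically assert is that merging per-partition partial states gives the same result as a single pass over all rows.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Iterable, TypeVar

T = TypeVar("T")  # input row type
P = TypeVar("P")  # partial (intermediate) state
R = TypeVar("R")  # final result type


@dataclass(frozen=True)
class Aggregation(Generic[T, P, R]):
    """Hypothetical stand-in for a four-function aggregation contract."""
    init: Callable[[], P]           # empty partial state
    iterate: Callable[[P, T], P]    # fold one row into a partial state
    merge: Callable[[P, P], P]      # combine two partial states
    final_agg: Callable[[P], R]     # turn a partial state into the result


def run_partial(agg: Aggregation, rows: Iterable) -> object:
    """Fold all rows of one partition into a single partial state."""
    state = agg.init()
    for row in rows:
        state = agg.iterate(state, row)
    return state


# Example: average, kept as a (sum, count) pair so partials can be merged.
average = Aggregation(
    init=lambda: (0.0, 0),
    iterate=lambda s, x: (s[0] + x, s[1] + 1),
    merge=lambda a, b: (a[0] + b[0], a[1] + b[1]),
    final_agg=lambda s: s[0] / s[1] if s[1] else None,
)
```

Keeping the partial state as a (sum, count) pair rather than a running mean is what makes `merge` well defined across partitions of different sizes.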

**Small bug fix:**
- #3842 — switch single-file dataset downloads to use the browser's native 
download UI

**Small feature:**
- #3142 — support `COUNT(*)` in the Aggregate operator's COUNT function


GitHub link: https://github.com/apache/texera/discussions/5701
