Re: [Apertium-stuff] Question about disambiguation in Apertium

2024-06-22 Thread Xavi Ivars
Apertium spa-cat makes heavy use of the preferences system to choose between
variant features (and even between different standards or language styles
within the same dialect), doing exactly what Kevin and Tino propose.

The dictionary is tagged with the features, and then different modes apply
different CG files (see
https://github.com/apertium/apertium-cat/blob/master/apertium-cat.cat_valencia.prefs.rlx
and other similar files) that apply those preferences by default.
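
For reference, the variable mechanism Kevin describes below can also be driven from a script. This is only a minimal Python sketch under assumptions: the helper names apertium_env and translate are hypothetical, the apertium executable is assumed to be on PATH with the mode compiled in the given directory, and only a single variable is set.

```python
import os
import subprocess


def apertium_env(variable, base_env=None):
    """Return a copy of the environment with AP_SETVAR set to `variable`."""
    # Copy so that neither os.environ nor the caller's dict is modified.
    env = dict(os.environ if base_env is None else base_env)
    env["AP_SETVAR"] = variable
    return env


def translate(text, mode, pair_dir=".", variable=None):
    """Pipe `text` through `apertium -d pair_dir mode` and return its output.

    If `variable` is given, it is exported as AP_SETVAR for that one call.
    """
    env = apertium_env(variable) if variable else dict(os.environ)
    result = subprocess.run(
        ["apertium", "-d", pair_dir, mode],
        input=text,
        capture_output=True,
        text=True,
        env=env,
        check=True,  # raise CalledProcessError if apertium exits non-zero
    )
    return result.stdout
```

Calling translate("mau o mal", "oci-fra", variable="src_lengadocian") from the pair directory would mirror the export/echo commands quoted below.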
--
Xavi Ivars
< http://xavi.ivars.me >

On Wed, 19 June 2024, 15:44, Kevin Brubeck Unhammer wrote:

> > How can I define src_lengadocian as the variable that means the source
> > language is lengadocian ?
>
> Hm, it kind of depends. In general, if you use variables, you can do
>
> export AP_SETVAR=src_lengadocian
> echo mau o mal | apertium -d . oci-fra
>
> and that variable will be available to the CG as VAR:src_lengadocian
>
> If you put it in oci-fra.preferences.xml, it will also show up on the
> web like the Preferences d'estil button at
> https://beta.apertium.org/index.cat.html#?dir=cat-spa
>
> But maybe these source language differences actually *should* be kept as
> separate pipelines, and shown as different source languages in the
> language selector in the web UI? In that case, it might actually be
> simpler to not do variables at all, and just have a separate CG file
> with lengadocian rules that runs before the regular CG. So in your
> oci-fra_lengadocian mode in
> https://github.com/apertium/apertium-oci-fra/blob/master/modes.xml#L373
> instead of
>
>   
> 
>   
>   
> 
>   
>
> you would have the general automorf, but two CG disambiguator steps
>
>   
> 
>   
>   
> 
>   
>   
> 
>   
>
> and the first CG would just have a few rules for lengadocian-specific
> stuff.
>
>
>
>
> ___
> Apertium-stuff mailing list
> Apertium-stuff@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>
___
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff


Re: [Apertium-stuff] GSoC 2024 mentors & admins must log in

2024-03-12 Thread Xavi Ivars
I just logged in as a mentor, and accepted the program rules.

Let me know if there's anything else I should do.


--
Xavi Ivars
< http://xavi.ivars.me >

On Tue, 12 March 2024, 7:37, Tino Didriksen wrote:

> Yes, even if you already registered last year. We just got a warning that
> we only have 1 admin (me), even though I was sure we had 3. So,
> https://summerofcode.withgoogle.com/
>
> "Before you can add an Org Member who has participated in previous
> programs to your organization for 2024, they must first agree to the 2024
> Program Rules and Org Member agreement by logging into their GSoC dashboard
> and clicking the 2024 and expanding it to see the 2024 Terms."
>
> As soon as possible, or we'll be kicked out of the program.
>
> -- Tino Didriksen
>
>


Re: [Apertium-stuff] Apertium in GSoC 2024?

2024-02-05 Thread Xavi Ivars
Today is the last day. Should we apply (or has anyone already applied)?


--
Xavi Ivars
< http://xavi.ivars.me >



Re: [Apertium-stuff] Apertium in GSoC 2024?

2024-01-31 Thread Xavi Ivars
Kevin came up with quite a few ideas 😅

I think we should go for it. I'm happy to mentor this year.

Message from Kevin Brubeck Unhammer on Fri, 26 Jan 2024 at 10:41:

> > [CC: -stuff and PMC]
> >
> > Should we apply for Google Summer of Code this year? Deadline Feb 6th.
> >
> > -- Tino Didriksen
>
> I'd be happy to mentor at least. Some projects that I personally would
> love to see happen:
>
> * More dictionaries and language data! Whether from scratch or converting
> sources
> * Implement preferences in existing pairs
> * Capitalisation handling in existing pairs
> * Faster / more robust recursive transfer
> * (alternatively / more experimental) CG-based transfer
> * Language Server Protocol and/or better editor support
>   (newly open sourced Zed editor supports tree-sitter …)
> * WASM
> * Nice dictionary UI for web (and generally fixing web papercuts)
>
> Also, not fully thought through yet, but I'd love some way of debugging
> apertium-separable ("why did this rule apply/not apply"). I suppose any
> tooling here would probably also help with regular dix.
>
> Also, is there anything that could make using our data in *other* tools
> and systems easier?
>
> best regards,
> Kevin
>
>
>
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Changes to apertium-preprocess-transfer

2023-06-26 Thread Xavi Ivars
That's awesome!

Thanks Daniel!

I'm sure Hèctor (and everyone else working on oci-fra) will especially
appreciate it 😊

Message from Daniel Swanson on Mon, 26 June 2023 at 23:01:

> Greetings Apertiumers!
>
> I recently identified a way that apertium-preprocess-transfer was
> being rather inefficient and today I fixed it, so tomorrow you all
> should be able to update to apertium 3.9.4 and see some improved
> compile times for any pairs not using apertium-recursive, with
> speedups between 10x and 7000x on the files I tested.
>
> I'm writing this email to let you know that in the process
> apertium-preprocess-transfer lost the ability to report partial
> overlaps like the following:
>
> Warning at line 6867, column 4: Paths to rule 27 blocked by rule 24.
>
> And I just wanted to let you all know, in case someone was depending
> on those. To compensate, I added a check to apertium-lint which can
> report roughly the same information:
>
> Warning (overlapping-paths) on line 6852: The sequence [preadv
> vblex.pp n.*] matches both this rule and the rule on line 6628.
>
> Daniel, who is trying to get better at not doing things that
> potentially break people's workflows without telling them
>
>
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] GSoC 2023 Mentors & Ideas?

2023-01-25 Thread Xavi Ivars
Me too, I'll mentor.

I think we need to do some cleanup on the ideas. TBH, I don't know where we
are in some areas, like website development or the Python API, or whether
the remaining work is worth a GSoC project.

Message from Kevin Brubeck Unhammer on Tue, 24 Jan 2023 at 14:37:

> I'll mentor :)
>
> --
> Kevin Brubeck Unhammer
>
> > GSoC 2023 org application is open, but do we have mentors for this year?
> > Please report in if you want to mentor.
> >
> > And as every year, please review
> > https://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code -
> > add/remove/amend ideas.
> >
> > -- Tino Didriksen
> >
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Old Catalan morphological analyser

2022-12-28 Thread Xavi Ivars
That seems to be it... This should probably be migrated into a (new?)
monolingual package on GitHub, or into apertium-cat as a new variant.

Message from Daniel Swanson on Wed, 28 Dec 2022 at 23:12:

> Is this what you're looking for?
>
>
> https://sourceforge.net/p/apertium/svn/HEAD/tree/incubator/apertium-oldca-XX/
>
> Daniel
>
> On Wed, Dec 28, 2022 at 11:27 AM Mikel L. Forcada  wrote:
> >
> > Dear Apertiumers,
> >
> > I have searched but I haven't been able to recover the Apertium
> > dictionary for old Catalan, which IIRC, was ported to Apertium format
> > from the old InterNOSTRUM format. Does anyone know where I can find it?
> > I've searched SourceForge too.
> >
> > Thanks a million!
> >
> > Mikel
> >
> >
> > --
> > Mikel L. Forcada
> > Dept. de Llenguatges i Sistemes Informàtics
> > Edifici Politècnica IV,
> > Universitat d'Alacant
> > E-03690 Sant Vicent del Raspeig (Spain)
> >
> > Phone: +34 96 590 9776
> > m...@dlsi.ua.es
> >
> >
> >
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


[Apertium-stuff] Fwd: [GSoC Mentors] GSoC 2023 open for org applications January 23 - February 7

2022-12-07 Thread Xavi Ivars
, we look forward to our 19th year in 2023!

Best,

Stephanie, Romina and Perry



-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Trouble with apertium-python

2022-08-26 Thread Xavi Ivars
Sorry for taking so long to reply.

I've been trying to reproduce this, but I can't even install the package.
When I try to install apertium-python using pipenv, it fails.

It seems that this fix [1] was never published to PyPI.

Tino, Sushain, would it be possible to push a new version to PyPI?

I'm happy to do some maintenance on the repo first, to merge some of the
open pull requests (like the Dependabot ones).

Xavi

[1] https://github.com/apertium/apertium-python/issues/83

Message from Mikel L. Forcada on Sat, 30 Jul 2022 at 13:41:

> Dear apertiumers:
>
> I may have sent this message to the wrong list, or twice.  Apologies if
> you receive it more than once.
>
> Following up on Adam Bittlingmayer's Telegram message and this comment on
> StackOverflow,
>
>
> https://stackoverflow.com/questions/73151814/apertium-package-in-python-returns-module-not-installed-error-after-installing-m
>
>  I tried installing apertium on a python virtual environment:
>
> conda create -n apertium python=3
>
> conda activate apertium
>
> pip install apertium
>
> I wonder why it asked me for root permissions; I would expect this to be
> different in a virtual environment (where it would have to install all of
> apertium!). I went on, and then, in a Python shell, I executed his sequence
> of commands:
>
> import apertium
> apertium.installer.install_module("eng")
> apertium.installer.install_module("deu")
> apertium.installer.install_module("eng-deu")
>
> I was surprised to see that the system asked me again for my root
> password, probably for monolingual or bilingual modules not installed.
>
> Then I ran
>
> r = apertium.Translator('eng', 'deu')
> print(r.translate('cats'))
>
> and sure, the first time it failed, saying something about the module not
> installed, but restarting the shell it worked!
>
> But then it segfaulted and dumped core on a longer sentence (it did not
> with others).
>
> >>> print(r.translate('Cats are beautiful.'))
> Violació de segment (s'ha bolcat la memòria)
>
> Any idea what may be going on?
>
> (1) why does it need to install apertium things with root privileges when
> run from a virtual env?
>
> (2) why does it say the module is not installed unless I restart the shell?
>
> Thanks a million!
>
> Mikel
>
> --
> Mikel L. Forcada
> Dept. de Llenguatges i Sistemes Informàtics
> Edifici Politècnica IV,
> Universitat d'Alacant
> E-03690 Sant Vicent del Raspeig (Spain)
>
> Phone: +34 96 590 9776
> ...@dlsi.ua.es
>
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Apertium PMC Election: Bypass election?

2022-04-27 Thread Xavi Ivars
I'm OK with both approaches. If we want people voting, that's fine for me.

But voting just to confirm (or push back on?) the only group of people
who volunteered seems a bit useless.

Maybe if someone outside the PMC gave their opinion, voting would make more
sense. But so far, it's been only the ones in the PMC (+ Sushain + Daniel),
everyone agreeing.


--
Xavi Ivars
< http://xavi.ivars.me >

On Wed, 27 Apr 2022, 10:43, Tanmai Khanna wrote:

> If the rest of the PMC is okay with it, I guess that's what we'll do. I
> still propose a vote so that we can have it on the record that the assembly
> of committers is okay with this decision. I get that by not replying
> there's an assumed consent but it really seems iffy when there's an
> election involved.
>
> It shouldn't take too long. Just a thought :)
>
> Tanmai
>
> On Wed, Apr 27, 2022, 14:05 Tino Didriksen  wrote:
>
>> There is precedent even in legislative bodies:
>> https://en.wikipedia.org/wiki/Unanimous_consent
>>
>> We've given ample time and updates, and we have a possible outcome that
>> can be achieved by unanimous consent. I say we take it and get on with the
>> business of forming the Foundation, which the (new) PMC can delegate the
>> legwork of to anyone.
>>
>> -- Tino Didriksen
>>
>>
>> On Tue, 26 Apr 2022 at 17:41, Tanmai Khanna 
>> wrote:
>>
>>> I get the sentiment but to be honest, not holding elections at all does
>>> dent credibility a little. Tino offered to stand for President, and so did
>>> Francis. Maybe we should have a vote, secure a mandate and then move on.
>>>
>>> That way it'll be clear to everyone that the leader is elected and not
>>> just selected unopposed.
>>>
>>> What do you guys think?
>>>
>>> On Tue, Apr 26, 2022, 19:54 Tino Didriksen 
>>> wrote:
>>>
>>>> G'day everyone,
>>>>
>>>> It's been a week, and we have 7 candidates for PMC and 2 for President:
>>>> https://github.com/apertium/elections
>>>>
>>>> - PMC: Francis M. Tyers, Jonathan N. Washington, Kevin Brubeck
>>>> Unhammer, Mikel L. Forcada, Tanmai Khanna, Tino Didriksen, Xavi Ivars
>>>>
>>>> - President: Francis, Tino
>>>>
>>>> Given that exactly 7 for PMC would avoid the need for an election, I'm
>>>> happy to forego the whole thing and let Francis continue in the role of
>>>> President.
>>>>
>>>> So, I propose that if this is still the status by end-of-day UTC
>>>> tomorrow (2022-04-27 23:59:59 UTC, ~34 hours from now), then we don't hold
>>>> the election and simply replace Sushain K. Cherivirala with Kevin Brubeck
>>>> Unhammer. CC'ed Sushain directly.
>>>>
>>>> -- Tino Didriksen
>>>>


Re: [Apertium-stuff] Apertium PMC Election: Census & Candidates

2022-04-21 Thread Xavi Ivars
I've updated the elections repo with my name.

Message from Tanmai Khanna on Tue, 19 Apr 2022 at 19:32:

> Maybe in this new PMC we should strictly follow at least a once a month
> meeting. Even if there's nothing to discuss or no progress made, it helps
> to meet and keeps things moving. Just a thought
>
> On Tue, Apr 19, 2022, 22:59 Francis Tyers via Apertium-stuff <
> apertium-stuff@lists.sourceforge.net> wrote:
>
>> On 2022-04-19 12:20, Tino Didriksen wrote:
>> > G'day everyone,
>> >
>> > It is almost time to vote for the
>> > https://wiki.apertium.org/wiki/Project_Management_Committee again, but
>> > first some introductory motions, namely updating the census and
>> > calling for candidates.
>> >
>> > Actual election will start in a week, barring complications.
>> >
>> > === Census:
>> > See https://github.com/apertium/elections/blob/main/census.tsv
>> >
>> > If you are not on the census list and want to be, or we have the wrong
>> > email for you, or you wish to be removed from the census, let us know
>> > or submit a PR for the change.
>> >
>> > === Candidates:
>> > Do you want to be a PMC member? Speak up!
>> >
>> > Do you want to be the Apertium President? Likewise, speak up!
>> >
>> > Or amend the https://github.com/apertium/elections repo.
>> >
>> > === The election itself:
>> > We will likely use https://www.belenios.org/ to run the election
>> > itself, and while that doesn't require as much trust in the election
>> > runners, it'd still be nice if someone who isn't planning on running
>> > for PMC or President will want to be one of the admins for the
>> > election. Any volunteers?
>> >
>> > -- Tino Didriksen
>>
>> I'd like to run for the PMC again. I'm happy to put myself forward as
>> president if no-one else wants to, although I have been pretty absent
>> and definitely not the most active contributor this past year.
>>
>> Fran
>>
>>
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Apertium Google Cloud owner?

2022-04-18 Thread Xavi Ivars
It didn't sound familiar, but I double-checked. I don't.

Message from Tino Didriksen on Mon, 18 Apr 2022 at 21:00:

> Hello everyone,
>
> Who owns https://console.cloud.google.com/home/dashboard?project=apertium
> ?
>
> -- Tino Didriksen
>
>

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Apertium? GSoC Org Apps close Monday, Feb 21 at 1800 UTC

2022-02-20 Thread Xavi Ivars
I was catching up with my email, and saw a message from Jonathan mentioning
it a couple of days ago


--
Xavi Ivars
< http://xavi.ivars.me >

On Sun, 20 Feb 2022, 17:30, Tino Didriksen wrote:

> Forgot about GSoC application - are we doing it this year? Has anyone
> started the application yet?
>
> -- Tino Didriksen
>
>
> On Fri, 18 Feb 2022 at 22:23, 'sttaylor' via Google Summer of Code Mentors
> List  wrote:
>
>> Hi all,
>>
>> If your organization is interested in participating as a Mentor Org for
>> GSoC 2022 be sure to complete your organization application *before
>> Monday at 1800 UTC* by going to *g.co/gsoc <http://g.co/gsoc>*.
>>
>> Once you have completed the application you will receive an email letting
>> you know your application will be reviewed over the next two weeks with
>> orgs being contacted March 6th with their status. You can also verify you
>> successfully submitted your application by looking at your Org Admin
>> dashboard where you will see 'Application Submitted, Pending Approval.'
>>
>> Helpful Links
>>
>> Roles and Responsibilities
>> <https://developers.google.com/open-source/gsoc/help/responsibilities>
>>
>> Timeline <https://developers.google.com/open-source/gsoc/timeline>, FAQs
>> <https://developers.google.com/open-source/gsoc/faq>
>>
>> Marketing Materials
>> <https://developers.google.com/open-source/gsoc/resources/marketing>
>> (slide deck, flyers), Videos
>> <https://developers.google.com/open-source/gsoc/videos>
>>
>>
>> As always please feel free to email us at gsoc-supp...@google.com with
>> any questions. Thanks!
>>
>> Best,
>>
>> Stephanie Taylor, GSoC Program Lead
>>
>


Re: [Apertium-stuff] Changes on apertium-apy

2021-12-14 Thread Xavi Ivars
In the meantime, here are updated images for apertium/base and
apertium/apy. They will be rebuilt on every commit to master, and
apertium/base is additionally rebuilt once a week.

Use them the same way as the Docker Hub images:

docker pull ghcr.io/apertium/apy:latest





-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Changes on apertium-apy

2021-12-14 Thread Xavi Ivars
While looking back at this, I realized that we haven't published any image
to Docker Hub for more than 7 months, to the point that using
apertium/apy:latest as a base image may not work in some cases (it is based
on a rather old Debian version, which is now oldstable instead of stable).

For now, I've added another GitHub Action that will build and push to the
GitHub Container Registry, and I'll do the same to push "apertium/base".

I'll wait for Sushain's answer before also pushing those images to Docker Hub.





-- 
< Xavi Ivars >
< http://xavi.ivars.me >


[Apertium-stuff] Question about regtest

2021-12-10 Thread Xavi Ivars
Hi Daniel, Tino,

I've been running apertium-regtest on apertium-spa-cat, and there are a ton
of changes, which I think appeared after we added preferences to the
language pair.

Now there are a lot of differences in the *-gold.txt files that look like
this:

[-DumAiorwCM6]
-Julio González i Jesús Marzo. Juli Cèsar.
+Julio González i Jesús Marzo. Juli Cèsar. [/option]
[/-DumAiorwCM6]


But I have no idea whether that is related or not, and with the number of
changes that exist (the web UI is showing 27 pages), it's quite hard to
really understand what's going on.
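
One way to triage this might be to count how many changed lines differ only by the trailing " [/option]" marker. The following is a minimal Python sketch, not part of apertium-regtest: classify_changes is a hypothetical helper, and it assumes the changes are available as diff-style lines prefixed with "-" and "+", as in the snippet above.

```python
def classify_changes(diff_lines, suffix=" [/option]"):
    """Count -/+ pairs that differ only by `suffix` versus all other pairs.

    A pair is a '-' line immediately followed by a '+' line.
    Unpaired lines and any other lines are ignored.
    Returns a tuple (suffix_only, other).
    """
    suffix_only = 0
    other = 0
    removed = None  # text of the most recent '-' line still waiting for a '+'
    for line in diff_lines:
        if line.startswith("-"):
            removed = line[1:]
        elif line.startswith("+") and removed is not None:
            added = line[1:]
            if added == removed + suffix:
                suffix_only += 1
            else:
                other += 1
            removed = None
        else:
            # Block markers such as [-DumAiorwCM6] reset the pairing.
            removed = None
    return suffix_only, other
```

If nearly all pairs land in the first count, the differences would most likely come from a single cause rather than from 27 pages of unrelated regressions.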

Any help?


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Github Actions

2021-12-10 Thread Xavi Ivars
Actually, this has already helped me identify two issues: the
apertium-spa-cat and apertium-fra tests (using apertium-regtest) are broken.

I'm pretty sure the contributors to those packages don't really know how
regtest works, or what needs to be done to fix those tests.



-- 
< Xavi Ivars >
< http://xavi.ivars.me >


[Apertium-stuff] Github Actions

2021-12-10 Thread Xavi Ivars
While working on some fixes for Apertium APy this morning, I added a
GitHub Action to make sure tests were running and code coverage stats were
pushed to Coveralls.io, and I really liked how GitHub Actions works.

To better test the possibilities GitHub Actions gives us, I set up the same
scripts we already had in Travis CI (not working now) on GitHub Actions, for
apertium-spa, apertium-cat and apertium-spa-cat.

I found it's way more powerful, and we can do pretty cool things.

This is what I've set up:

*Apertium GitHub Actions repository [1]*

This repository contains reusable workflows and actions that can be
referenced by other repository-specific workflows.

There are currently 2 workflows implemented, one for monolingual [2] and
another one for bilingual [3] modules.

*Apertium GitHub repository [4]*

This repository already existed, and was used to define Apertium's profile
on GitHub. I've added "workflow templates" there, which make it extremely
easy to set up a reusable workflow in an existing module.

As an example, this is what would be required to set up the monolingual
workflow in apertium-fra:

   1. Go to the Actions tab.
   2. (In case workflows already exist) click on create new workflow.
   3. Scroll down, and select the appropriate workflow. For apertium-fra,
   it would be the monolingual one.
   4. For a bilingual module, replace xxx-yyy, xxx and yyy with the language
   codes of your bilingual module and of its monolingual modules.
   5. DONE! Tests are integrated!



References:
[1] https://github.com/apertium/github-actions/tree/master
[2] https://github.com/apertium/github-actions/tree/master#monolingual-buildyml
[3] https://github.com/apertium/github-actions/tree/master#bilingual-buildyml
[4] https://github.com/apertium/.github
-- 
< Xavi Ivars >
< http://xavi.ivars.me >


[Apertium-stuff] Changes on apertium-apy

2021-12-09 Thread Xavi Ivars
Hi all,

I've been doing a bit of cleanup on apertium-apy, and realized that Travis
hasn't been working for a while.

To fix that, I did an initial integration with GitHub Actions.

I haven't spent too much time on it (Tino mentioned on IRC he's working on
a better CI for all modules) but I didn't want to allow merges with tests
failing.

I'd love it if those of you who have more expertise on Apy could take a look
at this:

https://github.com/apertium/apertium-apy/pull/184

Other things I've seen not working properly on that repo:
- Pushes to Docker Hub
- Pushes to PyPI?
- Not sure if I broke the code coverage check with my changes

If there are no objections, I'll probably merge it in the next couple of
days. Even if this is just a starting point, it should already be better
than what we had (no checks running at all).
-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Bitrotted releases redux

2021-03-30 Thread Xavi Ivars
Tested against v1.2.1 of apertium-spa, running some translations in both
pairs, and it seems to be working pretty well.

Ideally, we'd need to do a testvoc, but I'm not sure it's worth it: what we
have now in master is probably better than whatever was last published.

Message from Xavi Ivars on Tue, 30 Mar 2021 at 19:11:

> Oh, that's true.
>
> Should we just copy it locally to that pair as a stop gap?
>
> Or just pointing it to *any* spa version (including latest) should work.
> I can run some tests locally; I'll give it a try against v1.2.1 and
> report back.
>
> Message from Daniel Swanson on Tue, 30 Mar 2021 at 19:06:
>
>> -es-gl has its own dictionaries, but depends on -spa's .rlx file.


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Bitrotted releases redux

2021-03-30 Thread Xavi Ivars
Oh, that's true.

Should we just copy it locally to that pair as a stopgap?

Or just pointing it to *any* spa version (including latest) should work.
I can run some tests locally; I'll give it a try against v1.2.1 and
report back.

Message from Daniel Swanson on Tue, 30 Mar 2021 at 19:06:

> -es-gl has its own dictionaries, but depends on -spa's .rlx file.


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Bitrotted releases redux

2021-03-30 Thread Xavi Ivars
Message from Tino Didriksen on Mon, 29 Mar 2021 at 22:00:

> https://github.com/apertium/organisation/issues/23 is still open, and
> these languages/pairs need someone to sign off on current state or make a
> formal new release.
>
> * https://github.com/apertium/apertium-cym
> * https://github.com/apertium/apertium-hin
> * https://github.com/giellalt/lang-sme
> * https://github.com/apertium/apertium-es-gl and must depend on a
> specific release of https://github.com/apertium/apertium-spa
> * https://github.com/apertium/apertium-cym-eng and must depend on a
> specific release of https://github.com/apertium/apertium-cym and
> https://github.com/apertium/apertium-eng
> * https://github.com/apertium/apertium-eng-spa
> * https://github.com/apertium/apertium-sme-nob
> * https://github.com/apertium/apertium-spa-ita and must depend on a
> specific release of https://github.com/apertium/apertium-spa and
> https://github.com/apertium/apertium-ita
> * https://github.com/apertium/apertium-urd-hin and must depend on a
> specific release of https://github.com/apertium/apertium-urd and
> https://github.com/apertium/apertium-hin
>
> The existing releases cannot build and/or run with current Apertium tools.
> These are holding back Apertium 3.7 from the main site. The alternative is
> that I remove these languages/pairs from the main site.
>
>
Hey Tino,

I was looking into this, and I'm not sure why apertium-es-gl needs to
depend on a specific release of spa. As far as I can see, this package
still hasn't moved to using monolingual packages, and has its own Spanish
dictionaries, tagger, etc. Am I missing something?
-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Proper noun classification considered harmful

2021-02-02 Thread Xavi Ivars
Hèctor, please correct me if I am wrong.

In Catalan, for example, we have gender annotated for proper nouns because,
as Hèctor explained, it's useful in some cases when translating to
French. So the Catalan monolingual package generates rich tags for np.

However, when translating to Spanish, that information (from Catalan) is
not that useful, so we didn't bother adding genders there. The way we
managed it was by adding RL rules in both spa and cat that consume
"genderless" nps, regardless of how they are generated.

So I think that could be an approach: annotate only when the output is
useful, but account for simpler input when generating.
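
As a rough illustration of that idea (a toy sketch in Python, not how the
Apertium dictionaries are actually written; the entries, the tag strings and
the `generate` helper are all made up for the example): the generation side
accepts both the gender-annotated and the "genderless" analysis of a proper
noun, so it does not matter which monolingual package produced the input.

```python
# Toy generation table: (lemma, tags) -> surface form.
# Entries and tag strings are illustrative assumptions, not real dictionary data.
GENERATION = {
    ("Maria", ("np", "ant", "f", "sg")): "Maria",  # rich analysis, with gender
    ("Maria", ("np", "ant", "sg")): "Maria",       # "genderless" analysis, generation only
    ("Joan", ("np", "ant", "m", "sg")): "Joan",
    ("Joan", ("np", "ant", "sg")): "Joan",
}


def generate(lemma, tags):
    """Return the surface form, or '#lemma' when no entry matches."""
    return GENERATION.get((lemma, tuple(tags)), "#" + lemma)


print(generate("Maria", ["np", "ant", "f", "sg"]))  # input with gender
print(generate("Maria", ["np", "ant", "sg"]))       # input without gender
print(generate("Pere", ["np", "ant", "sg"]))        # no entry at all
```

The point is only that the extra generation-only entries make the output
independent of how rich the incoming tags are.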

--
Xavi Ivars
< http://xavi.ivars.me >

On Tue, 2 Feb 2021, 22:40, Kevin Brubeck Unhammer wrote:

> Hèctor Alòs i Font wrote:
>
> > I am more sceptical about the need to distinguish between toponyms and
> > hydronyms. In some languages one will have an article and the other will
> > not, but these are rare cases. On the other hand, we do not distinguish
> > between countries (or regions) and cities, which in French is quite
> > important both for generating the article and the preposition preceding
> > it, if you translate from Catalan or Spanish: for instance, "New-York"
> > is the city, but "le New-York" is the state, so we will have
> > "à New-York" or "au New-York" for "in New-York" (or "à Paris" but
> > "en France"). The generation of articles may also not be the same
> > whether "Barcelona" stands for the city or the (football or whatever)
> > team, nor is the gender often the same. So, are we then going to create
> > more and more subtypes ad nauseam? Better not!
> >
> > In short, we can find casuistries in certain pairs that may make us think
> > that some distinctions are appropriate, but adding them in monolingual
> > dictionaries and forcing them to be maintained for all languages seems
> > doubtful to me.
>
> So the city-vs-region distinction is only useful for target (structural)
> generation, not source analysis/disambiguation/anaphora. I think that
> can be a good guide to when something should be in monodixen or not.
>
> One solution here would be to add it in bidix (with a pardef so you
> don't need it when going the other way) and strip it in transfer, or
> even just use a def-list in the transfer files.


Re: [Apertium-stuff] An easy tool to report bad translations and propose alternatives

2020-12-05 Thread Xavi Ivars
I did that for quite a long time for Softcatalà, and gave up: most of the
reports were actually wrong or useless. They ranged from people who picked
the wrong language pair to people who were suggesting wrong translations,
including many people writing something like "this is wrong" in the form.


--
Xavi Ivars
< http://xavi.ivars.me >

On Sat, 5 Dec 2020, 12:24, Tino Didriksen wrote:

> We can trivially make a Report Bad Translation button on the website that
> pops up a 3-field dialog, where the input (static), output (static), and
> user's correction (that they fill in) can be submitted to a database.
>
> -- Tino Didriksen
>
>
> On Sat, 5 Dec 2020 at 10:28, Hèctor Alòs i Font 
> wrote:
>
>> A Sardinian collaborator commented to me: "Wouldn't it be possible that
>> every time there are more possible translations these come out in a little
>> window where the user chooses the right solution, as in spell checkers"?
>>
>> This could be an idea for a GSoC tool project. Nevertheless, I don't
>> think that, as he puts it, this is the best option because, in general, we
>> have few multiple options in the bilingual dictionaries. Probably, another
>> type of interface would be more appropriate. Is there anything done in the
>> GSoC projects that could be used?
>>
>> With him, we use a simple spreadsheet in a Google Documents-like system.
>> He enters a word or phrase, the current translation, the suitable
>> translation and the context (sentence). This is not at all intuitive, nor
>> easy, for a conventional user, but it is very useful. We have already dealt
>> with several hundred errors in the Italian-Sardinian translator.
>>
>> Hèctor
>>


Re: [Apertium-stuff] Releases for everything

2020-11-16 Thread Xavi Ivars
I hate being a pain in the *** here, so I'll start by apologizing for
being so annoying with this but...

Is there anything I can do to help with the releases?

Message from Tino Didriksen on Wed, 21 Oct 2020 at 9:43:

> The world conspired against me being productive in any way, so everything
> stalled.
>
> But since then, https://github.com/apertium/apertium/issues/108 has come
> up, which feels like a larger underlying issue. I hope
> https://github.com/apertium/lttoolbox/pull/105 will fix that - will test
> that today.
>
> -- Tino Didriksen
>
>
> On Tue, 20 Oct 2020 at 23:51, Xavi Ivars  wrote:
>
>> Hi!
>>
>> Any updates on new releases?
>>
>> I was looking at the current released versions of "apertium", and it
>> seems that it's 3.6.1, which was released last October (2019).
>>
>> Tag 3.6.3 (released July 1st 2020) hasn't been released yet, and nothing
>> that includes all the amazing work done this summer (including new
>> blank/format handling, posttransfer, etc) has been released either (it
>> hasn't even been tagged in Github).
>>
>> Are there any blockers for these releases to happen?
>>
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Releases for everything

2020-10-20 Thread Xavi Ivars
Hi!

Any updates on new releases?

I was looking at the current released versions of "apertium", and it seems
that it's 3.6.1, which was released last October (2019).

Tag 3.6.3 (released July 1st 2020) hasn't been released yet, and nothing
that includes all the amazing work done this summer (including new
blank/format handling, posttransfer, etc) has been released either (it
hasn't even been tagged in Github).

Are there any blockers for these releases to happen?

Message from Xavi Ivars on Mon, 5 Oct 2020 at 19:30:

> Hey Tino,
>
> Do we have any idea of when the new releases will be ready?
>
> Message from Tino Didriksen on Mon, 21 Sep 2020 at 18:45:
>
>> Given that Apertium has undergone big changes and a binary compat break
>> since last formal release, just about everything is getting re-released and
>> re-packaged.
>>
>> So if any of you have any changes that will break binary compat, and that
>> you can commit this week, now would be a good time to make those.
>>
>> -- Tino Didriksen
>>
>>
>
>
> --
> < Xavi Ivars >
> < http://xavi.ivars.me >
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] A talk evaluating Apertium

2020-10-19 Thread Xavi Ivars
On Mon, 19 Oct 2020, 22:10, Jaume Ortolà i Font wrote:

>
> No. The corpus was not postedited. It has 2 million sentences. I tried to
> get a Catalan translation as good as possible. What I did was:
>
> - Try to cover all relevant vocabulary: all non-capitalized words that
> appear at least 4-5 times in the corpus.
> - Fix spelling and grammar errors in the Spanish corpus using LanguageTool
> (for example, missing diacritics or agreement errors). The Spanish text is
> worse than expected.
> - Fix many common errors in spa-cat Apertium translation.
>
> This work is not complete. To finish it, we'll need probably 3-4 months of
> full-time work or more. Anyway, a neural translator can work even if a
> percentage of the corpus is not perfect.
>


Sorry, I said postedited, but I meant "tweaked". Jaume (who did most, if
not all, of the work) has explained it.
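
To make the first step Jaume describes a bit more concrete, here is a
minimal sketch (my own illustration, not the script that was actually used;
the function name, the crude tokenisation and the threshold of 4 are
assumptions) that lists the non-capitalized words appearing at least a
given number of times in a corpus:

```python
import re
from collections import Counter


def frequent_lowercase_words(lines, min_count=4):
    """Return non-capitalized words seen at least min_count times, most frequent first."""
    counts = Counter()
    for line in lines:
        # \w+ is a crude tokenizer; a real pipeline would use a proper one.
        for token in re.findall(r"\w+", line):
            if token[0].islower():
                counts[token] += 1
    return [word for word, n in counts.most_common() if n >= min_count]


corpus = ["la casa es grande", "la casa de Madrid", "la casa", "la otra casa"]
print(frequent_lowercase_words(corpus))
```

The resulting list could then be checked against the dictionaries to find
the vocabulary that still needs to be covered.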


Re: [Apertium-stuff] A talk evaluating Apertium

2020-10-19 Thread Xavi Ivars
Well, that's only "part" of the corpus... and for the Europarl, that part
of the corpus was not left "as is" after Apertium, but also postedited.

The talk was specifically about eng-cat, and in that case, for the NMT
model, Apertium was not involved.
-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] A talk evaluating Apertium

2020-10-17 Thread Xavi Ivars
I actually think apertium eng-cat may be one of the English-Romance pairs
that has received the most care. Marc has worked quite a lot on it since he
did it during a GSoC, and has continued working on it, adding newly
developed modules (perceptron tagger, separable, lexical selection,
anaphora resolution,...)

I agree with Mikel that the evaluation is well made. And after having tried
the neural engine a few times, it will probably be hard to get there, in
terms of fluency, with Apertium. Translations with Apertium are, usually,
more "robotic".

--
Xavi Ivars
< http://xavi.ivars.me >

On Fri, 16 Oct 2020, 19:35, Tanmai Khanna wrote:

> Not to get too defensive haha but the talk makes certain statements,
> such as "Neural machine translation is more fluent and adequate compared to
> RBMT" (not verbatim), and later on a comparison of post-editability, but
> doesn't really comment on the status of these tools, i.e. how much data
> the NMT systems were trained on, and how much the English-Catalan system
> of Apertium has been worked on (which we know is not a lot), so I was just
> surprised to see general statements against RBMT. Not saying this as a
> member of Apertium, but generally evaluations like this should be done
> carefully and all the factors should be considered before comparing
> multiple systems; without that, it's easy to arrive at false conclusions.
>
> My two cents :)
> *तन्मय खन्ना *
> *Tanmai Khanna*
>
>
> On Thu, Oct 15, 2020 at 9:27 PM Mikel L. Forcada  wrote:
>
>> Dear Apertiumers:
>>
>> here's a 20-minute talk from Vicent Briva where he evaluates Apertium
>> English–Catalan in comparison with the SoftCatalà neural engine and
>> Google Translate.
>>
>> I think the evaluation is quite well made.
>>
>> https://www.youtube.com/watch?v=IiRVhAYpecw
>>
>> We do not fare too well but, hey, we know this language pair needs love.
>>
>> All the best,
>>
>>
>> Mikel
>>
>>
>>


Re: [Apertium-stuff] Releases for everything

2020-10-05 Thread Xavi Ivars
Hey Tino,

Do we have any idea of when the new releases will be ready?

Message from Tino Didriksen on Mon, 21 Sep 2020 at 18:45:

> Given that Apertium has undergone big changes and a binary compat break
> since last formal release, just about everything is getting re-released and
> re-packaged.
>
> So if any of you have any changes that will break binary compat, and that
> you can commit this week, now would be a good time to make those.
>
> -- Tino Didriksen
>
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] let's move the mailing lists to sourcehut

2020-09-23 Thread Xavi Ivars
At Softcatalà we've used mailman for years, and we haven't had any problem
so far.

We're actually rethinking our mailing list strategy: we've moved a lot of
conversation to specific Telegram groups, etc.

But if the concern about self-hosting it is maintenance, it is not *that hard*.

Message from Daniel Swanson on Wed, 23 Sep 2020 at 16:45:

> https://sourceforge.net/p/forge/documentation/Mailing List Archives/
>
> It looks like at least exporting them is possible.
>
> On Wed, Sep 23, 2020 at 9:40 AM Francis Tyers  wrote:
>
>> On 2020-09-23 15:03, Jonathan Washington wrote:
>> > One other question:
>> >
>> > Will it be possible to move existing apertium-stuff (and PMC, etc)
>> > archives to the new location?  Or would we be starting over with those
>> > archives?
>> >
>>
>> I think the PMC list is a distribution list, not a mailing list, so we
>> don't have archives.
>>
>> And in terms of the archives. I doubt it, but perhaps SourceForge has
>> a way to export them...
>>
>> Fran
>>
>>
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Need help to decide language pairs for examples of markup handling

2020-09-09 Thread Xavi Ivars
Message from Tanmai Khanna on Wed, 9 Sep 2020 at 11:34:

> Hey guys,
> I'm writing a system demonstration to be submitted at LowResMT 2020 about
> the recent project that was done as part of GSoC, titled "Translating the
> internet into low resource languages with Apertium" (Accepting snazzier
> title suggestions).
>
> As part of this demonstration, I want to show some real world examples of
> how the new system of markup handling will help the translation of webpages
> and formatted documents - odt, pptx, rtx, etc. To show this effectively, I
> need to choose 3-4 released language pairs that are sufficiently
> syntactically divergent that they show the effect of markup reordering in
> the translation output. As far as I know, spa-cat is one of our most mature
> pairs, however I'm not sure how syntactically divergent it is. If it is,
> then I'm happy to be corrected. If your language pair has had issues with
> webpage translation and those issues are now solved (ish), then some
> examples would be really helpful.
>
>
Spanish and Catalan are very similar in terms of syntax. We could
definitely try to get examples of where they diverge the most, but those
examples would need to be completely synthetic.

The new markup handling does help in other areas, though: in formats where
inline tags are common (like ODT), the previous formatter/deformatter was
splitting words where tags appeared, so translation of those has improved
quite a lot.

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] We need more evaluators

2020-08-20 Thread Xavi Ivars
Hey Mikel,

You can add me to en-ca. I don't know much French, but I guess I could
also help there.

Message from Mikel L. Forcada on Thu, 20 Aug 2020 at 9:39:

> Dear Apertiumers:
>
> About 10 days ago I wrote a message asking for apertiumers to volunteer in
> evaluating sets of 150 bilingual dictionary entries. We still need more
> evaluators.
>
> We need two more evaluators for Esperanto–English, two more for
> French–Catalan and one for Occitan–French.
>
> Please contact our GSoC student Shashwat Goel if you can help.
>
> Thanks a million!
>
> All the best
>
> Mikel Forcada
>
> [New] En - Es - Assigned to Hector, Mikel, Jorge. More people not required
> [New] En - Ca - Assigned to Hector, Mikel. One more person would be
> helpful.
> Eo - En - Done by Hector. Two more people would be helpful.
> Fr - Ca - Done by Hector. Two more people would be helpful.
> Oc - Fr - Done by Serge. Assigned to Gisele. One more person would be
> helpful.
>
> --
> Mikel L. Forcada  http://www.dlsi.ua.es/~mlf/
> Departament de Llenguatges i Sistemes Informàtics
> Universitat d'Alacant
> E-03690 Sant Vicent del Raspeig
> Spain
> Office: +34 96 590 9776
>
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Testing markup reordering in Apertium

2020-07-20 Thread Xavi Ivars
Hello! I just found a (potential) issue, and wanted to double check with
you (it's probably something you already looked into and is not a real
issue): looking at transfuse's code, I saw tf-mangle-mode is doing tf-close
on the generation step.

How does it work when postgen steps merge some of the words?


--
Xavi Ivars
< http://xavi.ivars.me >

On Mon, 20 Jul 2020, 20:34, Tanmai Khanna wrote:

> Hey guys,
> Using wordbound blanks, we've modified the Apertium pipeline, modules and
> stream such that inline markup tags now move around with words in transfer,
> merge when LUs merge, split when LUs split, to preserve the formatting of
> the input document. If you want to follow the further development of this
> project, see here
> <https://wiki.apertium.org/wiki/User:Khannatanmai/Wordbound_blanks>.
>
> We have a decent version that is ready to test that does markup handling
> for html documents. It will undergo extensive testing as part of this
> project, but I thought it'll be a good idea to let the community test it
> themselves on their language pairs based on their needs so that we can
> understand what features need to be added, and what needs to be fixed.
> Apertium users have been asking for markup handling for quite some time now
> and had no other option but to use wrappers that try to guess alignments.
> I'm hoping this project helps in that regard. Here's what you need to test
> this:
> - Make sure you have the latest commits of apertium and lttoolbox
> installed.
> - Latest commits of -separable, -anaphora, etc. if you're using those in
> your mode.
> - Clone and install https://github.com/TinoDidriksen/transfuse .
>
> After this all you need to do is pipe your html document to
> tf-html-fragment and give as argument a translation mode of your language
> pair of choice (full translation modes).
>
> Example:
>
> $ echo 'Hello big green world!' | tf-html-fragment
> /Users/khannatanmai/Documents/GSoC/repo/main/apertium-eng-spa/modes/eng-spa.mode
>
>
> Hola Mundo verde grande !
>
>
> It only works for html right now, but we're in the process of supporting
> all usual document types.
>
>
> *Known issues:*
>
> - If a transfer rule has multiple words in the pattern, and in the output
> there is a LU that wasn't clipped from any word in the input, it won't put
> a wordbound blank on that LU.
>
> - If -separable detects a string of words then the format of each will be
> combined and added on the entire string of words.
>
> - apertium-recursive isn't supported as of now. It will be by the end of
> the project though.
>
>
> If you have any questions, suggestions, I'd be glad to respond to them on
> this thread. If you need help testing this on your language pair you can
> contact us on the IRC. Same if you find any bugs, or have any feature
> requests.
>
>
> Enjoy!
> *तन्मय खन्ना *
> *Tanmai Khanna*


Re: [Apertium-stuff] Semantics in Apertium (was Apertium's Wider Use & Secondary Tags)

2020-06-15 Thread Xavi Ivars
>
> Just because we "can" add information, does not mean we "should".
>

Yes, I agree. But I think the "material" example that Hèctor raised (*for
instance, as a rule, Catalan preposition "de" is translated as "de" in
French, but if the following word is a material, "en" must be selected (de
fusta > en bois)*) is a good one where the transfer (an improved one, for
sure) would also benefit from having that information available.
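
A toy sketch of that rule, just to show what "having that information
available" would mean (Python for brevity; the `MATERIALS` list and the
function name are invented for the example, and this is not the lrx
format):

```python
# Hypothetical semantic list; in Hèctor's proposal this would live in apertium-cat.
MATERIALS = {"fusta", "ferro", "vidre"}


def translate_de(next_lemma):
    """Pick the French translation of Catalan 'de' based on the following word."""
    return "en" if next_lemma in MATERIALS else "de"


print(translate_de("fusta"))  # following word is a material
print(translate_de("Joan"))   # following word is not a material
```

Any module that can see the "material" information could apply the same
kind of check, not only lexical selection.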

Message from Francis Tyers on Mon, 15 Jun 2020 at 18:45:

> On 2020-06-15 17:38, Hèctor Alòs i Font wrote:
> > Here come several practical examples. I tried to select them for their
> > variety. The result is more a wish list than something structured.
> >
> > Let's begin with "je la baise". Depending on the context this may be
> > "I kiss her" or "I fuck her". The context can tell us if we are in a
> > formal or colloquial type of language. Another issue is that in this
> > case the anaphora resolution can also help us: if the pronoun
> > reference is "hand", it can only be "kiss"; if it is a person, the
> > doubt persists.
> >
> > Another kind of problem is the Arpitan words "chamô" ("camel"; plural
> > "camels") and "chamôs" ("chamois"; unchanged in plural). So,
> > translating into French, yesterday I got chamois in a Bible text of
> > Exodus xD  I solved it by deciding in a CG rule that all "chamôs"
> > (with nothing around them in the singular) are camels. (Similar cases in
> > French: fil/fils, foi/fois, cour/cours)
> >
> > In French there are plenty of words with different meanings, depending
> > on the gender: livre, page, tour, etc. The problem is that often the
> > immediate surrounding context does not disambiguate: des livres, les
> > pages, de tour, etc. A similar but slightly different case is the word
> > pairs homicide mf/homicide m, féminicide mf/féminicide m, parricide
> > mf/parricide m, etc.: the one with the gender "mf" is a person and the
> > other is the action.
> >
> > Other problems come in lexical selection. For instance, as a rule,
> > Catalan preposition "de" is translated as "de" in French, but if the
> > following word is a material, "en" must be selected (de fusta > en
> > bois). So in the Catalan2French lrx file we have a list of materials,
> > as we have a list of countries, a list of musical instruments, a list
> > of animals, etc. I dream about a monolingual dictionary where we could
> > get this kind of information. It is not useful to have these lists for
> > many language pairs using Catalan. This information should be in
> > apertium-cat and not in every apertium-cat-xxx lrx file.
> >
> > Moreover, if we had words not only with different kinds of semantic
> > labels, but also marked as synonyms, maybe it'd be possible to give a
> > translation using a word labeled as a synonym (if it has a translation)
> > instead of "unknown".
> >
>
> These are excellent examples, I'm just about to go out, but will address
> them when I get back. Thanks for the ideas..
>
> Note that my suggestion was to include this information
> in the monolingual packages.
>
> Fran
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >
___
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff


Re: [Apertium-stuff] Semantics in Apertium (was Apertium's Wider Use & Secondary Tags)

2020-06-15 Thread Xavi Ivars
Message from Francis Tyers on Mon, 15 June 2020 at 17:26:

>
> [...]
>
> And pass it to the lexical selection module which will choose the
> one with the highest weight.
>
> This would mean a new module, but it would require only minor
> changes to the bilingual dictionary and lexical selection, and
> wouldn't have any effect on transfer.
> [...]


The difference between your approach and mine is that your proposal is
extremely coupled to the order of the modules in the pipeline. The new
module would write <2.0> and apertium-lex-tools would need to read and
remove it from the pipeline.

Ideally, I'd like to decouple setting the "domain" of a word from using it.
If something just after the tagger, still as part of the "analysis" phase of
the translation, puts that information in there, then it can be used by
"lex-tools", but also by other modules that may need it. If we don't do
this, multiple modules may each need to read the "domain list" data to
assign the right domain to a given word.
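
To make the decoupling concrete, here is a minimal Python sketch, not a description of any existing Apertium module. It assumes a hypothetical table of semantic labels that lives in the monolingual package (the kind of material list Hèctor mentions for "de fusta" > "en bois"), one analysis-phase step that attaches the labels, and one pair-specific consumer that reads them. All names and data (SEMANTIC_LABELS, annotate, select_preposition_cat_fra) are invented for illustration.

```python
# Hypothetical semantic labels, stored once in the monolingual package.
SEMANTIC_LABELS = {
    "fusta": {"material"},
    "ferro": {"material"},
    "França": {"country"},
}


def annotate(lemmas):
    """Analysis-phase step: attach semantic labels to each lemma, once."""
    return [(lemma, SEMANTIC_LABELS.get(lemma, set())) for lemma in lemmas]


def select_preposition_cat_fra(annotated):
    """One consumer: Catalan "de" becomes French "en" before a material.

    Every other lemma is passed through untranslated in this sketch.
    """
    out = []
    for i, (lemma, _labels) in enumerate(annotated):
        next_labels = annotated[i + 1][1] if i + 1 < len(annotated) else set()
        if lemma == "de" and "material" in next_labels:
            out.append("en")
        else:
            out.append(lemma)
    return out


print(select_preposition_cat_fra(annotate(["taula", "de", "fusta"])))
print(select_preposition_cat_fra(annotate(["llibre", "de", "Joan"])))
```

The point of the split is that only annotate() reads the label table; any later module (lexical selection or something else) just consumes the labels already present on the tokens.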

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Apertium's Wider Use & Secondary Tags

2020-06-13 Thread Xavi Ivars
On Sun, 14 June 2020, 0:18, Jonathan Washington <
jonathan.n.washing...@gmail.com> wrote:

>
>
> I think Xavi's point is that there are a number of ways to approach this,
> and having the option of another stream to put this extra information could
> be one of them.  Imho, it is nicer in many ways than even having (very
> arbitrary) superscripts (that aren't really any better to have in a
> morphological analysis than _fruta).
>

Exactly. Ideally, I wouldn't want to add superscripts (or _fruta) that may
break other langpairs (or even monolingual usages).

Instead, if I could add semantic information in a monolingual step, then I
would use that only in the lexical selection of the language pairs that
need it.


--
Xavi Ivars
< http://xavi.ivars.me >


Re: [Apertium-stuff] Apertium's Wider Use & Secondary Tags

2020-06-13 Thread Xavi Ivars
Before anything else, let me say that I like the proposal to enhance the
pipeline with more data (including, but not limited to, the surface forms),
so that we can properly do things that we currently do in very hacky (to
me) and definitely non-linguistic ways:

xavi@dell:~/src/apertium-spa$ echo "El mango" | apertium -d . spa-morph
^El/el$ ^mango/mango/mangar/
*mango_fruta*$^./.$


In this example, we "add" semantic information to the pipeline (and
disambiguate via CG3) by creating a "fake lemma" needed for spa-cat,
because "mango" (pan handle) and "mango_fruta" are translated
differently in Catalan. But this, in turn, forces every other language pair
using Spanish to know about "mango_fruta", even if its translation is
the same as that of "mango".

And yes, I know this example could also be solved by using lex-tools, where
the translation would change based on the context. But "the rules" to
decide if it's "mango" or "mango_fruta" do not depend on the
translation, but completely on the source language. Ideally, I'd like to
have a module in apertium-spa that allows me to add this semantic
information (that can perfectly be one instance of lex-tools), and then be
able to use it (or not) in different language pairs.
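
As a rough illustration of the problem, the Python sketch below compares the fake-lemma approach with keeping the sense as separate side information. The bidix fragments are made up for the example ("spa-xxx" is a hypothetical pair whose bidix only knows plain "mango"), and the functions are not part of any Apertium tool.

```python
# Made-up bidix fragments, for illustration only.
SPA_CAT_FAKE = {"mango": "mànec", "mango_fruta": "mango"}  # knows the fake lemma
SPA_CAT_PLAIN = {"mango": "mànec"}                         # real lemmas only
SPA_XXX = {"mango": "mango"}                               # hypothetical other pair


def translate_fake_lemma(lemma, bidix):
    """Fake-lemma approach: every bidix must list "mango_fruta" itself."""
    return bidix.get(lemma, "*" + lemma)  # "*" marks an unknown word


def translate_with_sense(lemma, sense, bidix, sense_rules=None):
    """Side-information approach: the lemma stays "mango"; a pair may
    consult the sense through its own rules, or ignore it entirely."""
    if sense_rules and (lemma, sense) in sense_rules:
        return sense_rules[(lemma, sense)]
    return bidix.get(lemma, "*" + lemma)


SPA_CAT_RULES = {("mango", "fruta"): "mango"}

print(translate_fake_lemma("mango_fruta", SPA_CAT_FAKE))  # pair that knows it
print(translate_fake_lemma("mango_fruta", SPA_XXX))       # pair that does not
print(translate_with_sense("mango", "fruta", SPA_XXX))
print(translate_with_sense("mango", "fruta", SPA_CAT_PLAIN, SPA_CAT_RULES))
print(translate_with_sense("mango", None, SPA_CAT_PLAIN, SPA_CAT_RULES))
```

With the fake lemma, the pair that was never updated gets an unknown word; with the sense kept apart from the lemma, that pair keeps working unchanged and only spa-cat has to carry a sense-specific rule.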

I was going to talk about the points Fran raises, that I think are
extremely valuable. But I think Tanmai's answer (that came while I was
writing this) addresses them better than I would. With source identifiers,
we can keep the compounds and contractions information as it was in the
source, and then decide what to do with it.

But I also don't think we can ask this implementation to solve all the
current and future problems the Apertium pipeline format has. As long as
backwards compatibility is ensured, I don't see why having "more data"
available would cause any problem. And if, for any reason, it turns out
that passing along the surface form can't be used in all cases, I still
think "being able to do it" (again, while ensuring backwards compatibility)
is worthwhile for the cases where it will be useful (and, again, for
less-developed pairs with extremely developed monolingual dictionaries,
being able to avoid trimming in order to pass morphological information to
transfer would be a HUGE win).

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] ADDCOHORT in Constraint Grammar

2020-05-28 Thread Xavi Ivars
The reason I was asking was exactly because of that: we're not trying to
rewrite multi-tokens into smaller units but the opposite: expand smaller
units into multiple ones.

But just to be clear: I asked not because I thought it doesn't belong
there, but because I really don't know what the actual scope of separable is
(apart from having used it for a few phrasal verbs in eng-cat).


--
Xavi Ivars
< http://xavi.ivars.me >

On Fri, 29 May 2020, 0:39, Francis Tyers wrote:

> On 2020-05-28 23:12, Xavi Ivars wrote:
> > How would this fit in apertium-separable?
> >
> > As far as I know the goal of apertium separable is to handle
> > multi-words in a better way than in the monodixes.
> >
> > I totally get (and totally agree) that we should put in transfer only
> > stuff that is really about transfer between both languages and we
> > don't want to abuse it... But is that a good enough reason to abuse
> > another module? Or may it be the case that apertium-separable should
> > handle a broader set of use cases (and probably change its name)?
> >
> > --
> > Xavi Ivars
> > < http://xavi.ivars.me >
> >
> > On Thu, 28 May 2020, 16:14, Jonathan Washington wrote:
> >
> >> This could definitely be done in apertium-separable.  That would be
> >> by far the most straightforward way to solve this problem.  And if
> >> you did it as a language-specific lsx file as has been being
> >> discussed recently, it would serve the purpose you describe.
> >>
> >> Don't treat it as a structural transfer issue.  The less lexical
> >> stuff in transfer the better.
> >>
> >> --
> >> Jonathan
> >>
> >> On Thu, May 28, 2020, 06:47 Jaume Ortolà i Font
> >>  wrote:
> >>
> >> Isn't this something that should go in transfer,
> >>
> >> Dropping this "que" is possible in Spanish, but it is not regular
> >> syntax, it is a mannerism used in bureaucratic jargon. The regular
> >> syntax is with "que". It makes sense to add it, so all language
> >> pairs can translate as usual.
> >>
> >> Transfer is extremely annoying for this kind of things, in my
> >> experience.
> >>
> >> or you could
> >> use apertium-separable for it?
> >>
> >> Probably yes. We are not using apertium-separable in spa-cat, and it
> >> will be useful to do it.
> >>
>
> I think it fits better in separable (rewriting multitokens into smaller
> units) than in CG (disambiguation).
>
> Another place could be in the bilingual dictionary, a special tag or a
> special
> lexeme, marking the missing que on the target side.
>
> e.g. in the monodix you could have:
>
> rogar¹:pregar
> rogar²:pregar
>
>
> Then in the bidix:
>
> rogar¹:pregar
> rogar²:pregar# que
>
> Then if a clean -cat was desired, a transfer rule could just insert a
> que
> when lemq was "# que".
>
> In fact, if we consider this stylism to be a different lexeme it kind of
> makes sense.
>
> Fran
>
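
The last step of Fran's suggestion (insert "que" when the lemma queue is "# que") can be sketched in Python as follows. Real transfer rules are written in Apertium's XML rule format, so this only illustrates the logic; split_lemma and restore_que are hypothetical helper names.

```python
def split_lemma(lemma):
    """Split a lemma such as "pregar# que" into its head ("pregar")
    and its queue ("# que"); the queue is empty if there is no "#"."""
    head, sep, tail = lemma.partition("#")
    return head, (sep + tail) if sep else ""


def restore_que(target_lemma):
    """Return the output words for one target lemma, adding "que"
    only when the queue is exactly "# que"."""
    head, queue = split_lemma(target_lemma)
    if queue == "# que":
        return [head, "que"]
    return [target_lemma]


print(restore_que("pregar# que"))
print(restore_que("pregar"))
```

Because the marker travels with the lexeme in the bidix, the decision to emit "que" stays lexical and the rule itself needs no word list.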


Re: [Apertium-stuff] ADDCOHORT in Constraint Grammar

2020-05-28 Thread Xavi Ivars
How would this fit in apertium-separable?

As far as I know the goal of apertium separable is to handle multi-words in
a better way than in the monodixes.

I totally get (and totally agree) that we should put in transfer only stuff
that is really about transfer between both languages and we don't want to
abuse it... But is that a good enough reason to abuse another module? Or
may it be the case that apertium-separable should handle a broader set of
use cases (and probably change its name)?


--
Xavi Ivars
< http://xavi.ivars.me >

On Thu, 28 May 2020, 16:14, Jonathan Washington <
jonathan.n.washing...@gmail.com> wrote:

> This could definitely be done in apertium-separable.  That would be by far
> the most straightforward way to solve this problem.  And if you did it as a
> language-specific lsx file as has been being discussed recently, it would
> serve the purpose you describe.
>
> Don't treat it as a structural transfer issue.  The less lexical stuff in
> transfer the better.
>
> --
> Jonathan
>
> On Thu, May 28, 2020, 06:47 Jaume Ortolà i Font 
> wrote:
>
>> Isn't this something that should go in transfer,
>>
>>
>> Dropping this "que" is possible in Spanish, but it is not regular syntax,
>> it is a mannerism used in bureaucratic jargon. The regular syntax is with
>> "que". It makes sense to add it, so all language pairs can translate as
>> usual.
>>
>> Transfer is extremely annoying for this kind of things, in my experience.
>>
>>
>>> or you could
>>> use apertium-separable for it?
>>>
>>
>> Probably yes. We are not using apertium-separable in spa-cat, and it will
>> be useful to do it.
>>
>> Jaume
>>
>> Fran
>>>


Re: [Apertium-stuff] How useful is eliminating trimming for language developers?

2020-05-25 Thread Xavi Ivars
Hi,

I'm not sure if default trimming or default non-trimming should be the
right decision (probably, to be safer, a default-trimming approach would be
better as the starting point), but I want to bring up a few comments on
your list.

* In trimming disadvantage number 1, we're stating that we're OK with
having crappy monodixes because we *fix* that later on with trimming. I'm
sure that's where we are now, but as a project that focuses a lot on
providing free (as in speech) language resources that are later used for
many other use cases, I don't feel comfortable with that state of affairs.
I think we should aim to have dictionaries that are as correct as possible.
And if we did that, disadvantage number 1 would be smaller (even if it
didn't disappear completely).

* Advantage number 2 is the main reason I would want to skip trimming in
some of the language pairs I'm most involved in. Hèctor can weigh in, as
he's done a lot of work on these pairs, but I can give as an example the
pairs por-cat and ita-cat. Basically, when we have a very good
monodix (because we have a very good language pair using it), not trimming
could greatly improve much less developed language pairs using it. As an
example, the Catalan monodix is one of the best-maintained monodixes we have
as a project (being mainly developed for spa-cat and, to a lesser extent,
fra-cat and eng-cat). There are a ton of proper and common nouns in
apertium-cat that will probably never be in the por-cat or ita-cat bidixes,
but just by not trimming them, transfer rules would benefit greatly, and
translations would be much better because of that. So even if trimming
remains the default, I'd push to have it disabled for this type of pair.
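
A toy Python sketch of what trimming does, under the simplifying assumption that a dictionary is just a set of lemmas (real trimming operates on the compiled transducers). The word lists are invented for the example.

```python
# Toy data: a rich monolingual dictionary and a small bilingual one.
MONODIX_CAT = {"casa", "gat", "albercoc", "Xàtiva"}
BIDIX_CAT_SIDE = {"casa", "gat"}


def analysable(word, monodix, bidix_side, trim=True):
    """With trimming, a word is analysed only if the bidix also has it.
    Without trimming, every monodix entry keeps its analysis, even if
    the pair cannot translate the lemma itself."""
    vocabulary = monodix & bidix_side if trim else monodix
    return word in vocabulary


for word in ("gat", "Xàtiva"):
    print(word,
          analysable(word, MONODIX_CAT, BIDIX_CAT_SIDE, trim=True),
          analysable(word, MONODIX_CAT, BIDIX_CAT_SIDE, trim=False))
```

In the untrimmed case "Xàtiva" still reaches transfer with its analysis (for example as a proper noun), which is what lets the rules treat it correctly even though the bidix has no entry for it.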

My two cents,

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Secondary Tag Prefixes

2020-05-10 Thread Xavi Ivars
m strem
> >> format.
> >>
> >> [0] <http://black.bikeshed.com/>
> >> [1] <https://wiki.apertium.org/wiki/List_of_symbols>
> >>
> >> --
> >> Regards, Flammie <https://flammie.github.io>
> >> (Please note, that I will often include my replies inline instead of
> >> top or bottom of the mail)
>
> There is already
> https://github.com/apertium/streamparser
>
> for Python...
>
> Fran
>
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Bylaws Overhaul Proposal

2020-04-30 Thread Xavi Ivars
Honestly, this seems to me like overcomplicating things a lot.

It is true that the bylaws need to account for things that may happen, but
in the 15 years that Apertium has existed as a project, I have never seen
any abuse of power, and I think trying to solve problems that simply don't
exist is even counterproductive.

I wouldn't really focus too much on that.
--
Xavi Ivars
< http://xavi.ivars.me >

On Wed, 29 Apr 2020, 18:56, Samuel Sloniker wrote:

> Okay. Maybe at least have a group appointed by the PMC and confirmed by
> the Assembly that at least has the power to interpret the bylaws?
>
> On Wed, Apr 29, 2020 at 7:07 AM Tino Didriksen 
> wrote:
>
>> Any such secondary group would in practice be equally powerful as the
>> PMC, because they would need admin access to execute the auditing and
>> suspension. But because they would not be responsible for day-to-day
>> operations, they wouldn't be active to even spot patterns of abuse.
>>
>> So it would still be the PMC discovering something that needs to be acted
>> on immediately, and needing to consult another slow and potentially offline
>> party. It just doesn't work in practice.
>>
>> On top of that, it would further complicate elections.
>>
>> I recognize you want the three estates, but it's not practical.
>>
>> -- Tino Didriksen
>>
>>
>> On Wed, 29 Apr 2020 at 15:49, Samuel Sloniker 
>> wrote:
>>
>>> I am not suggesting the Assembly immediately do it. I am suggesting that
>>> at the time of each PMC election, the Assembly elect a separate group that
>>> would handle removals.
>>>
>>> On Tue, Apr 28, 2020 at 9:50 AM Tino Didriksen 
>>> wrote:
>>>
>>>> On Tue, 28 Apr 2020 at 18:27, Samuel Sloniker 
>>>> wrote:
>>>>
>>>>> Again, I believe the PMC should not be involved in removing Committer
>>>>> access, even temporarily. I think a separate elected group should do that.
>>>>>
>>>>
>>>> That simply can't work. If someone is actively abusing their access or
>>>> got hacked, we need to be able to immediately revoke access. Requiring
>>>> asking the Assembly up front is far too slow.
>>>>
>>>> -- Tino Didriksen
>>>>


Re: [Apertium-stuff] New release for apertium-fra-cat & por-cat

2020-03-30 Thread Xavi Ivars
In any case, we should probably further investigate that.

The locale format (language_country_variant) is not respected at the moment
because of the multiple underscores.
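
For illustration, here is a small Python sketch of a mode-name pattern that accepts digits in the variant (as in cat-por_PTpre1990) but allows only one underscore. This is NOT the regex used in APy; MODE_RE and parse_mode are hypothetical names, and the pattern is only an assumption about what such a check could look like.

```python
import re

# Hypothetical pattern: "src-trg" with an optional "_variant" suffix.
# The variant may contain letters and digits but no further underscore.
MODE_RE = re.compile(
    r"^(?P<src>[a-z]{2,3})-(?P<trg>[a-z]{2,3})(?:_(?P<variant>[A-Za-z0-9]+))?$"
)


def parse_mode(name):
    """Return the parts of a mode name as a dict, or None if it does not fit."""
    match = MODE_RE.match(name)
    return match.groupdict() if match else None


for name in ("cat-por", "cat-por_BR", "cat-por_PTpre1990", "cat-por_PT_pre1990"):
    print(name, parse_mode(name))
```

A locale-style name with two underscores (language_country_variant) is rejected by this pattern, which is the limitation mentioned above; supporting it would mean agreeing on a separator that downstream tools can split unambiguously.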


--
Xavi Ivars
< http://xavi.ivars.me >

On Mon, 30 Mar 2020, 5:38, Sushain Cherivirala wrote:

> > The issue was only on a "_" that was used initially. I dropped it, but
> I didn't know that numbers too were not allowed. Thanks for the patching!
>
> Yeah, I think we didn't fully investigate your funky variant :) There was
> no real intentionality around not including numbers. I think including
> underscores could cause some downstream issues so thanks for dropping it!
>
> On Sun, Mar 29, 2020 at 1:47 PM Hèctor Alòs i Font 
> wrote:
>
>> Message from Sushain Cherivirala on Sun, 29 March 2020 at 22:24:
>>
>>> Hèctor,
>>>
>>> > cat-por_PTpre1990
>>>
>>> I think that https://github.com/apertium/apertium-apy/issues/141 is
>>> relevant here. That particular variant isn't working since it has numbers
>>> in it. I hot-patched the regex and it seems to work now:
>>> https://www.apertium.org/apy/listPairs. Committed to master:
>>> https://github.com/apertium/apertium-apy/commit/2f391281532273cc2d229f645f0e5fdf30cfbe7f
>>> .
>>>
>>
>> The issue was only on a "_" that was used initially. I dropped it, but I
>> didn't know that numbers too were not allowed. Thanks for the patching!
>>
>> This also needed a tweak to the html-tools config to allow the variant.
>>> It looks like apertium.org picks it up now.
>>>
>>
>> Yes, it is working! Thanks!
>>
>>
>>> > By the way, "PTpre1990" stands for "European Portuguese (traditional
>>> orthography)". It should probably be added to the interface tags, but I
>>> don't know where is the file in github where the meaning should be added in
>>> several languages.
>>>
>>> You're looking for this file:
>>> https://github.com/apertium/apertium-apy/blob/master/language_names/variants.tsv
>>>
>>> Feel free to send a pull request my way. We'll have to upgrade APy to
>>> pick it up on the other end. I can cut a quick release for that after
>>> you've updated it.
>>>
>>> On Sat, Mar 28, 2020 at 12:06 PM Hèctor Alòs i Font <
>>> hectora...@gmail.com> wrote:
>>>
>>>> The new release of apertium-fra-cat is already available in
>>>> apertium.org. Thanks!
>>>> But I'm not sure the new version of apertium-por-cat is. At least the modes
>>>> cat-por_BR and cat-por_PTpre1990 are not there yet. Could someone take a
>>>> look, please?
>>>> By the way, "PTpre1990" stands for "European Portuguese (traditional
>>>> orthography)". It should probably be added to the interface tags, but I
>>>> don't know which file on GitHub is the one where the meaning should be
>>>> added in several languages.
>>>>
>>>> Hèctor
>>>>
>>>> Message from Hèctor Alòs i Font on Sat, 21 March 2020 at 0:07:
>>>>
>>>>> Thanks a lot, Tino!
>>>>>
>>>>> Message from Tino Didriksen on Fri, 20 March 2020 at 23:13:
>>>>>
>>>>>> Finally got around to packaging fra-cat and por-cat.
>>>>>> https://github.com/apertium/apertium-packaging/issues/26 has links
>>>>>> to exact commits and release tags.
>>>>>>
>>>>>> Pushed to Debian:
>>>>>> - https://salsa.debian.org/science-team/apertium-fra-cat v1.8.0
>>>>>> - https://salsa.debian.org/science-team/apertium-pt-ca v0.10.0
>>>>>>
>>>>>> And upgraded public APy instance.
>>>>>>
>>>>>> -- Tino Didriksen
>>>>>>
>>>>>>
>>>>>> On Sat, 8 Feb 2020 at 10:23, Hèctor Alòs i Font 
>>>>>> wrote:
>>>>>>
>>>>>>> A new release of apertium-fra-cat is ready to be packaged.
>>>>>>>
>>>>>>> It mostly contains many new translations in the bidix (more than
>>>>>>> 15,000). Besides:
>>>>>>> - disambiguation has been improved, especially for French
>>>>>>> - dozens of new lexical selection rules have been added
>>>>>>> - dozens of new trans

Re: [Apertium-stuff] Modifying the apertium stream format to include arbitrary information

2020-03-29 Thread Xavi Ivars
y
(Mikel & Fran, but also Felipe, Sergio,...) to please keep challenging
approaches like this, but also by pointing out what are the weak points of
the proposal, so they can be reconsidered and improved, but not doing an
amendment to the entire proposal.

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] GSoC--Apertium Website Development

2020-03-26 Thread Xavi Ivars
Don't you have an account already? Please go ahead and start working on it.

Message from Mohit Kumar Verma on Thu, 26 March 2020 at 16:37:

> Where is the option to create a draft proposal in the
> wiki?
>
> Thanks.
>
> On Thu, Mar 26, 2020 at 8:57 PM Xavi Ivars  wrote:
>
>> hi Mohit,
>>
>> The best thing is that you write a draft proposal in the wiki, and send a
>> link to the list, so mentors can discuss the proposal with you.
>>
>> If you said you have new ideas to implement, please add them to the
>> proposal as a starting point for the conversation.
>>
>> Xavi
>>
>> Message from Mohit Kumar Verma on Thu, 26 March 2020 at 14:17:
>>
>>> Hi
>>> I wanted to ask that what do you think can be accomplished in the
>>> project idea: Improvements to the Apertium Website.
>>> The more I think, the more I get new ideas to implement but they are
>>> just too much to be done in 3 month period.
>>> Can you please suggest how many tasks and what type of tasks I should
>>> include in the timeline.
>>>
>>> Thanks
>>> Mohit
>>>
>>>
>>>
>>> On Thu, Mar 26, 2020 at 3:49 PM Shrey Modi 
>>> wrote:
>>>
>>>> Hey Mohit
>>>> For review send it in the mailing list.
>>>>
>>>> All The Best
>>>> Shrey Modi
>>>>
>>>> On Thu, 26 Mar 2020 at 14:03, Mohit Kumar Verma 
>>>> wrote:
>>>>
>>>>> Hi
>>>>> My GSoC proposal is ready. I want to send it for a review before
>>>>> putting it on the GSoC website. Where should I send it?
>>>>>
>>>>> Thanks
>>>>> Mohit
>>>>>
>>>>> On Tue, Mar 24, 2020 at 6:35 PM Tino Didriksen 
>>>>> wrote:
>>>>>
>>>>>> "A randomly generated password for Yaimgr8 has been sent to
>>>>>> yaim...@gmail.com."
>>>>>>
>>>>>> -- Tino Didriksen
>>>>>>
>>>>>>
>>>>>> On Tue, 24 Mar 2020 at 13:21, Mohit Kumar Verma 
>>>>>> wrote:
>>>>>>
>>>>>>> Hi
>>>>>>> I am interested in project: Apertium Website Development and I will
>>>>>>> send proposal for it.
>>>>>>> I am requesting for wiki account.
>>>>>>> username:  yaimgr8
>>>>>>> email id: yaim...@gmail.com
>>>>>>>
>>
>>
>> --
>> < Xavi Ivars >
>> < http://xavi.ivars.me >


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] GSoC--Apertium Website Development

2020-03-26 Thread Xavi Ivars
Hi Mohit,

The best thing is for you to write a draft proposal in the wiki and send a
link to the list, so mentors can discuss the proposal with you.

Since you said you have new ideas to implement, please add them to the
proposal as a starting point for the conversation.

Xavi

Message from Mohit Kumar Verma on Thu, 26 March 2020 at 14:17:

> Hi
> I wanted to ask that what do you think can be accomplished in the project
> idea: Improvements to the Apertium Website.
> The more I think, the more I get new ideas to implement but they are just
> too much to be done in 3 month period.
> Can you please suggest how many tasks and what type of tasks I should
> include in the timeline.
>
> Thanks
> Mohit
>
>
>
> On Thu, Mar 26, 2020 at 3:49 PM Shrey Modi  wrote:
>
>> Hey Mohit
>> For review send it in the mailing list.
>>
>> All The Best
>> Shrey Modi
>>
>> On Thu, 26 Mar 2020 at 14:03, Mohit Kumar Verma 
>> wrote:
>>
>>> Hi
>>> My GSoC proposal is ready. I want to send it for a review before putting
>>> it on the GSoC website. Where should I send it?
>>>
>>> Thanks
>>> Mohit
>>>
>>> On Tue, Mar 24, 2020 at 6:35 PM Tino Didriksen 
>>> wrote:
>>>
>>>> "A randomly generated password for Yaimgr8 has been sent to
>>>> yaim...@gmail.com."
>>>>
>>>> -- Tino Didriksen
>>>>
>>>>
>>>> On Tue, 24 Mar 2020 at 13:21, Mohit Kumar Verma 
>>>> wrote:
>>>>
>>>>> Hi
>>>>> I am interested in project: Apertium Website Development and I will
>>>>> send proposal for it.
>>>>> I am requesting for wiki account.
>>>>> username:  yaimgr8
>>>>> email id: yaim...@gmail.com
>>>>>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Apertium PMC elections

2020-03-17 Thread Xavi Ivars
Do the bylaws say anything about being over 18?

I don't think it's explicit anywhere, but it is something we should
take into account.

Also, I don't think temporary resignations (getting off the PMC while GSoC
or GCI is going on and then coming back) are a good idea: that seems to me
like trying to find a workaround that literally follows the rules but in
practice breaks them (just to be clear: I'm not talking about "holding off"
on someone joining the PMC until their GSoC/GCI finishes, but about getting
in, out, and in again).


--
Xavi Ivars
< http://xavi.ivars.me >

On Tue, 17 Mar 2020, 16:46, Scoop Gracie wrote:

> Oh, okay. There's a similar scenario with GCI; I want to be a student
> again this year. Could we keep a vacancy on the PMC from the start of GCI
> until it ended?
>
> On Tue, Mar 17, 2020 at 8:40 AM Daniel Swanson 
> wrote:
>
>> Given that GSoC applications have already begun, I think that would
>> amount to the same thing.
>>
>> On Tue, Mar 17, 2020 at 11:38 AM Scoop Gracie 
>> wrote:
>>
>>> Alternatively, would it be possible for  him to be appointed now,
>>> temporarily resign before GSoC, and then be reappointed after GSoC ends?
>>>
>>> On Tue, Mar 17, 2020 at 7:29 AM Tino Didriksen 
>>> wrote:
>>>
>>>> Tanmai Khanna is both standing for PMC and is considering participating
>>>> as a student in GSoC for Apertium. These are incompatible - GSoC rules
>>>> state students cannot be officers of the organization they want to work 
>>>> for.
>>>>
>>>> I propose that if Tanmai Khanna is both elected and applies and is
>>>> accepted for GSoC, that we postpone his appointment to the PMC until final
>>>> GSoC evaluations are done. We discussed it a bit in the mentors IRC 
>>>> channel.
>>>>
>>>> Just one more quirk to add to the mix.
>>>>
>>>> -- Tino Didriksen
>>>>
>>>>
>>>> On Tue, 17 Mar 2020 at 09:33, Hèctor Alòs i Font 
>>>> wrote:
>>>>
>>>>> According to article 23 of the by-laws,
>>>>>
>>>>> "f) When an election is called, the Election Board will publish a
>>>>> temporary census of Committers with right to vote will be published.
>>>>> g) After 7 days to amend the census, a definitive census of Committers
>>>>> with right to vote will be published by the Election Board
>>>>> h) There will be 7 days for candidates to member and president to come
>>>>> forward
>>>>> i) The Election Board will proclaim the candidates to President and to
>>>>> Members of the Project Management Committee"
>>>>>
>>>>> On the one hand, a list of potential voters has been kindly published
>>>>> by Tino Didriksen here:
>>>>>
>>>>> https://docs.google.com/spreadsheets/d/1ECL_8Lkfx4A66xpHhbOTn7ljKoDcLa0w7MdFZC7DOpA/edit#gid=0
>>>>>
>>>>> The Election Board has not found any missing, and considers it a
>>>>> temporary census. If someone thinks there is a mistake, and someone is
>>>>> missing or left over, please contact the Election Board. The definitive
>>>>> census will be published after 7 days.
>>>>>
>>>>> On the other hand, so far, these people have indicated they want to
>>>>> run for PMC members:
>>>>>
>>>>> - Tino Didriksen
>>>>> - Scoop Gracie (pseudonym)
>>>>> - Xavi Ivars
>>>>> - Tanmai Khanna
>>>>> - Mikel L. Forcada
>>>>> - Francis Tyers
>>>>> - Jonathan Washington
>>>>>
>>>>> These are standing for PMC President:
>>>>>
>>>>> - Tino Didriksen (if there are more than 7 candidates for PMC members)
>>>>> - Francis Tyers
>>>>>
>>>>> If someone wants to stand as a candidate, there is still a week to do
>>>>> so.
>>>>>
>>>>> The Election Board
>>>>> Hèctor Alòs i Font
>>>>> Sevilay Bayatlı
>>>>> Daniel Swanson
>>>>>


Re: [Apertium-stuff] Apertium elections coming up.

2020-03-10 Thread Xavi Ivars
I'm also interested in running for the PMC.

Should we try to establish some dates?

Message from Tino Didriksen on Sat, 7 March 2020 at 13:35:

> Turns out my output definitely needed checking. In my quest to sync
> outside-collaborators, I forgot to double-check the direct members.
>
> Using https://gist.github.com/jdennes/11404512#gistcomment-2734613 I've
> now exported https://github.com/orgs/apertium/people and added the ones
> who were missing and have visible emails to the list. Also added ayushjain.
>
> So,
> https://docs.google.com/spreadsheets/d/1ECL_8Lkfx4A66xpHhbOTn7ljKoDcLa0w7MdFZC7DOpA
> is now up to 316 names, of which 314 have emails.
>
> -- Tino Didriksen
>
>
> On Fri, 6 Mar 2020 at 21:55, AJ  wrote:
>
>> Hi
>>
>> Can I be added to the list(spreadsheet) or know why can't I take part? I
>> am a GCI mentor and interested in taking part in activities of Apertium.
>>
>> Thank you
>>
>> Ayush Jain
>> IRC nick: ayushjain
>>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Apertium GSoC 2020? Deadline Feb 5th

2020-02-03 Thread Xavi Ivars
Hi,

I'd be happy to help with GSoC, and happy to mentor any project related to
the website or to Romance language pairs.

I already know that Gianfranco Fronteddu (he was a GSoC student a few years
ago, working on Sardinian-related pairs) wants to be a mentor as well.



On Tue, 28 Jan 2020 at 23:07, Amr Mohamed Hosny Anwar wrote:

> On 1/28/20 6:37 PM, Francis Tyers wrote:
> > On 2020-01-28 15:08, Tino Didriksen wrote:
> >> https://summerofcode.withgoogle.com/ is open for organization
> >> applications, until February 5th.
> >>
> >> Are we participating in 2020? Who's up for mentoring and
> >> administrating?
> >>
> >> Previous years at this stage we've at least talked about it, but this
> >> year it's been rather silent. Students are as usual already looking at
> >> the ideas page and asking how to get started.
> >>
> >> -- Tino Didriksen
> >
> > I'm potentially willing to mentor this summer, but I cannot be an admin
> > this year due to prior commitments. I hope to be able to resume next
> > year!
> >
> > Fran
> >
> >
> As a GSoC student with Apertium, I really liked the experience and I
> believe it's a great opportunity for university students to contribute
> to Apertium.
> I can help with tasks like archiving ideas that were already done in
> GSoC 2019.
>
> Amr
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Idiom translation module

2020-01-17 Thread Xavi Ivars
On Thu, 16 Jan 2020 at 12:09, Tanmai Khanna wrote:

> "Is there anything this wouldn't cover?"
>
> Idioms may be contiguous but need not be frozen. They can take arguments
> and can be modified based on TAM and GNP.
>

> But that can be handled by separable if I'm not wrong. So yeah, seems like
> we have this handled. As constructions get larger and more complex, I'm not
> so sure.
>


Both TAM and GNP can also be handled directly by multiwords in standard
apertium dictionaries.

http://wiki.apertium.org/wiki/Multiwords
-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Idiom translation module

2020-01-16 Thread Xavi Ivars
And, actually, something like this doesn't even need apertium-separable.

Apertium dictionaries already support multiwords when all words are
contiguous (apertium-separable handles the case when that is not possible).
And I guess idioms are contiguous in most cases.

Nothing currently prevents adding "pig in a poke" as a multiword to the
English monodix and "gato por liebre" to the Spanish monodix, and then doing
the mapping in the bidix.

Is there anything this wouldn't cover?
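
As a rough illustration of the idea above (not how lttoolbox is actually
implemented), a contiguous multiword can be treated as a single dictionary
unit with a longest-match lookup. The set and dictionary below are toy
stand-ins for monodix and bidix entries, and the function names are
hypothetical:

```python
# Toy stand-ins for dictionary entries; real Apertium dictionaries are
# XML (.dix) files compiled into transducers.
ENG_MULTIWORDS = {("pig", "in", "a", "poke")}
BIDIX = {"pig in a poke": "gato por liebre"}


def group_multiwords(words, multiwords):
    """Greedy left-to-right, longest-match grouping of contiguous multiwords."""
    max_len = max((len(m) for m in multiwords), default=1)
    out, i = [], 0
    while i < len(words):
        # Try the longest candidate first, down to two words.
        for n in range(min(max_len, len(words) - i), 1, -1):
            if tuple(words[i:i + n]) in multiwords:
                out.append(" ".join(words[i:i + n]))
                i += n
                break
        else:
            # No multiword starts here: keep the single word.
            out.append(words[i])
            i += 1
    return out


def translate(words):
    """Map each unit through the toy bidix; unknown units pass through."""
    return [BIDIX.get(unit, unit) for unit in group_multiwords(words, ENG_MULTIWORDS)]


print(translate("he bought a pig in a poke".split()))
```

Only the idiom is mapped here; the other words pass through untouched because
the toy bidix has no entries for them.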

On Thu, 16 Jan 2020 at 4:52, Scoop Gracie wrote:

> Oh, ok. So basically, this would be a duplicate of -separable?
>
> On Wed, Jan 15, 2020 at 7:48 PM Daniel Swanson 
> wrote:
>
>> Apertium-separable matches phrases and combines them so that the
>> dictionary can treat them as single words. It's called "separable" because
>> these phrases do not have to be contiguous, so it can convert ^take$ ^the$
>> ^trash$ ^out$ into something more like ^take# out$ ^the$ ^trash$.
>>
>> In your case, it could convert ^pig$ ^in$ ^a$ ^poke$ to ^pig# in# a#
>> poke$ which could then just be an entry in the bilingual dictionary with
>> the other side being ^gato# por# liebre$.
>>
>> http://wiki.apertium.org/wiki/Apertium-separable
>>
>> On Wed, Jan 15, 2020 at 10:09 PM Scoop Gracie 
>> wrote:
>>
>>> What does -separable do?
>>>
>>> On Wed, Jan 15, 2020 at 6:48 PM Daniel Swanson <
>>> awesomeevildu...@gmail.com> wrote:
>>>
>>>> I'm not sure we actually need a new module for this. It might be
>>>> possible to deal with all of your examples using apertium-separable. And
>>>> failing that, a transfer rule could be written for any particular idiom.
>>>>
>>>> On Wed, Jan 15, 2020 at 8:15 PM Scoop Gracie 
>>>> wrote:
>>>>
>>>>> I've created a proposal for a new module.
>>>>> https://docs.google.com/document/d/1hdOAc-wyl3cwoSsv2M3dg8h_N_gphzkBOr_fPcg6Ss8/edit?usp=sharing


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Updating the Apertium 2-page brochure

2019-11-06 Thread Xavi Ivars
I like the idea of having all these materials in a "promotional-materials"
repo (though I'm not sure whether it should instead go in a folder inside
https://github.com/apertium/organisation).

On Tue, 5 Nov 2019 at 11:28, Mikel L. Forcada wrote:

> Agreed! And if it is generated from source (e.g. dot), the source should
> also be somewhere there.
>
> The same with the block diagram.
>
> Mikel
>
>
> On 5/11/19 at 16:53, Jonathan Washington wrote:
> > Hi Mikel,
> >
> > I've been thinking recently that we also should put the Apertium SVG
> > logo in the repos somewhere.  I wonder if this sort of thing should be
> > grouped together (maybe in a "promotional materials" repo?) or
> > separately.
> >
> > What do other people familiar with Apertium's GitHub organisation think?
> >
> > Also consider this overarching organisational concept:
> > https://apertium.github.io/apertium-on-github/source-browser.html
> >
> > --
> > Jonathan
> >
> > On Mon, 4 Nov 2019 at 08:19, Mikel L. Forcada wrote:
> >> Dear all,
> >>
> >> Next month I will be travelling to the LT4All meeting
> >> (https://lt4all.elra.info/en/), where Apertium is a sponsor
> >> (https://lt4all.elra.info/en/sponsors/sponsors/).
> >>
> >> Attached goes a brochure in flat ODT (.fodt, text file, nicer for
> >> versioning) which is probably 2 years old and needs updating.
> >>
> >> We would have to:
> >>
> >> * Push this somewhere in GitHub.com/apertium, but where?
> >> * Update it (there are two graphs there).
> >>
> >> I'd appreciate it very much if:
> >>
> >> * Someone could push it in our repositories and tell us where, so that
> >>   I can start playing with it, and
> >> * People who know apertium better let me know of obvious changes that
> >>   need to be done (bearing in mind that this has to be two pages).
> >>
> >> Thanks a million!
> >>
> >> All the best,
> >>
> >> Mikel
> >>
> >> --
> >>   Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
> >> Departament de Llenguatges i Sistemes Informàtics
> >> Universitat d'Alacant
> >> E-03071 Alacant, Spain
> >> Phone: +34 96 590 9776
> >> Fax: +34 96 590 9326
> >>
>
> --
> Mikel L. Forcada  http://www.dlsi.ua.es/~mlf/
> Departament de Llenguatges i Sistemes Informàtics
> Universitat d'Alacant
> E-03690 Sant Vicent del Raspeig
> Spain
> Office: +34 96 590 9776
>
>
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] odt translation not working

2019-10-22 Thread Xavi Ivars
I found the issue: it's due to a bug introduced last May (), while doing
some improvements to the `apertium` main script.

I've pushed a PR that should fix it:
https://github.com/apertium/apertium/pull/60

On Mon, 21 Oct 2019 at 21:16, Xavi Ivars wrote:

> I found out where the problem is: postgeneration (lt-proc -p) is blowing
> up (with a core dumped error).
>
> Up to that point, translation works well.
>
> This is an output example of the different steps:
> https://gist.github.com/xavivars/196f97454427c48cf12b966bc3281663
>
> It would be great if anyone could take a look.
>
> On Mon, 21 Oct 2019 at 18:39, Jaume Ortolà i Font wrote:
>
>> Hi,
>>
>> I'm using the nightly version of Apertium, and the translation of ODT
>> files is not working.
>>
>> This does nothing. The result is the same file unchanged:
>> $apertium -d . -f odt spa-cat test.odt test-cat.odt
>>
>> It seems to happen with any language pair and only with ODT files.
>>
>> I'm not the only one experiencing this issue. Other people pointed it out
>> to me.
>>
>> Jaume
>>
>>
>
>
> --
> < Xavi Ivars >
> < http://xavi.ivars.me >
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] odt translation not working

2019-10-21 Thread Xavi Ivars
I found out where the problem is: postgeneration (lt-proc -p) is blowing up
(with a core dumped error).

Up to that point, translation works well.

This is an output example of the different steps:
https://gist.github.com/xavivars/196f97454427c48cf12b966bc3281663

It would be great if anyone could take a look.

On Mon, 21 Oct 2019 at 18:39, Jaume Ortolà i Font wrote:

> Hi,
>
> I'm using the nightly version of Apertium, and the translation of ODT
> files is not working.
>
> This does nothing. The result is the same file unchanged:
> $apertium -d . -f odt spa-cat test.odt test-cat.odt
>
> It seems to happen with any language pair and only with ODT files.
>
> I'm not the only one experiencing this issue. Other people pointed it out
> to me.
>
> Jaume
>
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] genv*dix.py need conversion

2019-09-12 Thread Xavi Ivars
Forget about it: that one has always been python3. I just got confused with
the genv*dix scripts.

On Fri, 13 Sep 2019 at 12:45, Xavi Ivars wrote:

> On Fri, 13 Sep 2019 at 12:28, Tino Didriksen wrote:
>
>> [...]
>>
>> I think only -por-cat uses both metalrx and metalrx-to-lrx, so if you
>> want to merge their functionality it should be harmless enough.
>>
>
> fra-cat (the pair I initially implemented the python script for) also uses
> both.
>
> https://github.com/apertium/apertium-fra-cat/blob/master/Makefile.am#L122
>
> What I'm not sure about is why the Python script in there already seems to
> be python3-compatible.
> --
> Xavi Ivars
> < http://xavi.ivars.me >
>


Re: [Apertium-stuff] genv*dix.py need conversion

2019-09-12 Thread Xavi Ivars
On Fri, 13 Sep 2019 at 12:28, Tino Didriksen wrote:

> [...]
>
> I think only -por-cat uses both metalrx and metalrx-to-lrx, so if you want
> to merge their functionality it should be harmless enough.
>

fra-cat (the pair I initially implemented the python script for) also uses
both.

https://github.com/apertium/apertium-fra-cat/blob/master/Makefile.am#L122

What I'm not sure about is why the Python script in there already seems to
be python3-compatible.
--
Xavi Ivars
< http://xavi.ivars.me >


Re: [Apertium-stuff] genv*dix.py need conversion

2019-09-12 Thread Xavi Ivars
On Fri, 13 Sep 2019 at 1:27, Kevin Brubeck Unhammer wrote:

>
> So it seems the python script
> https://github.com/apertium/apertium/blob/master/scripts/apertium-metalrx
> lets you do  with templates like {{pfoo}}:
>
> https://github.com/apertium/apertium-por-cat/blob/master/apertium-por-cat.cat-por.metalrx#L5
> that the caller can replace:
>
> https://github.com/apertium/apertium-por-cat/blob/master/apertium-por-cat.cat-por.metalrx#L3223
>
> Perhaps
>
> https://github.com/apertium/apertium/blob/master/scripts/apertium-metalrx-to-lrx.in
> should call both the XSLT and the python script? It seems they should be
> compatible, doing nothing if the special features aren't used. (If so,
> scripts/apertium-metalrx should probably be named somethingelse.py.)
>

Yes, I built that python script as a quick hack to see if that could be
done, and never really improved or documented it. My bad.

Originally, I tried to extend the xslt somehow, but after spending too much
time trying to figure it out, I decided to go with a simple python script.

It's compatible with the xslt: if I remember correctly, it needs to be
applied before, but that's about it in terms of requirements.
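
To make the idea concrete, here is a minimal sketch of {{name}} placeholder
expansion in Python. It is not the actual apertium-metalrx script: the expand
function, the sample rule and the value "casa" are assumptions for
illustration only.

```python
import re

# Matches placeholders such as {{pfoo}}; the name is captured in group 1.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def expand(template: str, values: dict) -> str:
    """Replace each {{name}} with values[name]; leave unknown names as-is."""
    def repl(match):
        # Unknown placeholders are kept verbatim so they stay visible.
        return values.get(match.group(1), match.group(0))
    return PLACEHOLDER.sub(repl, template)


# Hypothetical rule fragment containing one placeholder.
rule = '<match lemma="{{pfoo}}" tags="n.*"/>'
print(expand(rule, {"pfoo": "casa"}))
```

Text without placeholders comes back unchanged, which is the property that
would let such a step run harmlessly in front of the xslt.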

Personally, I'd also bundle xslt logic into the same script. I think python
is way more readable than xslt, but didn't do that either for several
reasons:
* xslt "was already there". It existed in the past, and whoever implemented
that in xslt had (probably) a good reason to
* new python dependency: didn't want to introduce python dependency to all
packages that used to use the xslt script
* consistency across pairs: if python was not added everywhere, the
alternative would be to have some pairs with that logic being done via the
xslt script and some others via the python one

But I think now it's a good opportunity to consolidate, decide where we
want to go (I'd vote for python helper scripts), and try to document it.

--
Xavi Ivars
< http://xavi.ivars.me >


Re: [Apertium-stuff] genv*dix.py need conversion

2019-09-11 Thread Xavi Ivars
I won't be able to do anything before September 20th (on vacation, without
computer around).

Once I get there, I'll change the scripts. I'm not sure how to make them
"become" part of the apertium package, so I'll try to do it in multiple
steps: first convert them to python3, and then move them out.

Would it be better to create a python package to contain this type of
script, instead of bundling everything into apertium?


--
Xavi Ivars

On Wed, 11 Sep 2019 at 22:02, Tino Didriksen wrote:

> The python2 scripts genvdix.py genvrdix.py genvldix.py are no longer
> usable, since python2 is dead and gone. They must be converted to python3
> and Makefile.am adjusted.
>
> Ideally remove them entirely and make them part of apertium so they can be
> installed and maintained centrally. Add shebang and install as
> apertium-genvdix, etc.
>
> Affecting at least:
> https://github.com/apertium/apertium-cat
> https://github.com/apertium/apertium-srd
> https://github.com/apertium/apertium-por
> https://github.com/apertium/apertium-por-cat
> https://github.com/apertium/apertium-eng-cat
> https://github.com/apertium/apertium-arg-cat
> https://github.com/apertium/apertium-spa-cat
>
> I cannot push anything to Debian that relies on python2.
>
> -- Tino Didriksen
>


Re: [Apertium-stuff] apertium-fra-cat new version

2019-03-15 Thread Xavi Ivars
On Fri, 15 Mar 2019 at 10:13, Hèctor Alòs i Font wrote:

> On Fri, 15 Mar 2019 at 12:08, Kevin Brubeck Unhammer wrote:
>
>>
>>
>> Xavi has a fix on the way for the problem, btw:
>>
>> https://github.com/apertium/apertium-separable/issues/1#issuecomment-471121936
>> but apertium-separable needs a new release I suppose
>>
>>
> After this fix, I discussed with @Xavi Ivars whether it'd be better to
> include it or to drop the call to apertium-separable, since it is not
> certain that the fix really solves all the problems. He recommended the
> latter.
>

Sorry for the misunderstanding. I actually think this fix will solve the
problem, so I wasn't suggesting removing apertium-separable; it's just that
I'm not 100% sure. But if you've already removed apertium-separable, that's
OK. Let's add it back to fra-cat after the release, to check that the
problem has really been solved.

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] French to Catalan is not working

2019-03-09 Thread Xavi Ivars
This bug was pretty annoying, as it was making the pair unusable, and *I
think* I finally fixed it after several attempts (though I'd love another
pair of eyes checking the solution).

Can anyone confirm it works with the latest code changes?

On Wed, 6 Mar 2019 at 20:28, Kevin Brubeck Unhammer wrote:

> Hèctor Alòs i Font wrote:
>
> > Thanks, Tino. Could we know which is the module that crashes and, even
> > better, at least one translation which makes it crash?
>
> I wrote that here:
> https://sourceforge.net/p/apertium/mailman/message/36588699/
> The quick fix is to make a version of the translator that doesn't use
> lsx-proc.
>
> The better solution is to fix NUL flushing in apertium-separable.
>
> GSoC applicants who wish to prove their C++ skills look here:
>
> https://github.com/apertium/apertium-separable/issues/1#issuecomment-464338745


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] apy/translate falla amb lt-proc -x

2019-02-02 Thread Xavi Ivars
By the way, I've already pushed the lttoolbox change to Apertium's GitHub.

The APy bug still needs fixing. I'll look at it now and push that too!

On Sat, 2 Feb 2019 at 19:37, Xavi Ivars wrote:

> That makes all the sense in the world. The -z mode lets the program keep
> "running" while multiple translations are sent to it, with no need to
> launch the program again. And \0 is used to mark the end of each
> translation.
>
> But in this case, since the character was being sent twice, it didn't
> work as it should.
>
> Thank you so much for finding the bug, Joan!
>
>
>
> On Wed, 30 Jan 2019 at 21:06, Joan Moratinos Jaume wrote:
>
>> I've made a change that seems to work well. I've added a condition so
>> that the original \0 is not copied. This is the output of 'git diff':
>> --- a/lttoolbox/fst_processor.cc
>> +++ b/lttoolbox/fst_processor.cc
>> @@ -2154,7 +2154,8 @@ FSTProcessor::intergeneration(FILE *input, FILE *output)
>>            {
>>              fputwc_unlocked(L'\\', output);
>>            }
>> -          fputwc_unlocked(val, output);
>> +          if (val != L'\0')
>> +            fputwc_unlocked(val, output);
>>          }
>>        }
>>        else
>>
>>
>> On Wed, 30 Jan 2019 at 11:30, Xavi Ivars  wrote:
>>
>>>
>>>
>>> On Tue, 29 Jan 2019 at 20:19, Joan Moratinos Jaume wrote:
>>>
>>>> The versions I have of lttoolbox, apertium and apy are the most recent
>>>> ones. Apy works perfectly for me, unless the "lt-proc -x -z" step is
>>>> included.
>>>> Are you sure you use this step at SoftCatalà? I've seen that it
>>>> translates "Dame la mano." as "Dóna'm la mà", with the old diacritics.
>>>>
>>>
>>> You're right, we don't use that version yet. In any case, the idea is to
>>> offer the option of using either the old diacritics or the new ones
>>> (that's why we haven't put the new ones directly into the normal mode).
>>>
>>>
>>>> I've managed to see what the problem is and I've fixed it on my
>>>> machine, but it's probably not a definitive solution. "lt-proc -x -z"
>>>> copies the '\0' character from the input (if there is one) and
>>>> generates another one. This duplication is the cause of the problem.
>>>> With the line that sends '\0' to the output disabled, everything runs
>>>> smoothly.
>>>>
>>>>
>>> It possibly is the definitive solution, if what was happening is that
>>> two \0 were being passed. Only one should be passed. But when I
>>> implemented the mode a few months ago, I couldn't see exactly where the
>>> original \0 was passed through. Did you find it?
>>>
>>> --
>>> < Xavi Ivars >
>>> < http://xavi.ivars.me >
>>
>>
>> --
>> Joan Moratinos
>> jmorati...@gmail.com
>
>
> --
> < Xavi Ivars >
> < http://xavi.ivars.me >
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] apy/translate falla amb lt-proc -x

2019-02-02 Thread Xavi Ivars
That makes all the sense in the world. The -z mode lets the program keep
"running" while multiple translations are sent to it, with no need to launch
the program again. And \0 is used to mark the end of each translation.

But in this case, since the character was being sent twice, it didn't work
as it should.
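
A minimal sketch of why a duplicated \0 breaks things, assuming a client that
reads exactly one \0-terminated chunk per request. The function names and the
second sentence pair are made up for illustration; this is not APy's or
lttoolbox's actual code.

```python
def split_responses(stream: bytes):
    """Split a NUL-delimited byte stream into one chunk per NUL terminator."""
    parts = stream.split(b"\0")
    # The last element is whatever follows the final NUL (normally empty).
    return [p.decode("utf-8") for p in parts[:-1]]


def pair_up(requests, responses):
    """Pair each request with the next chunk, one chunk per request."""
    return list(zip(requests, responses))


requests = ["Dame la mano.", "Buenos días."]

# Well-behaved stage: exactly one NUL after each translation.
good = "Dóna'm la mà.\0Bon dia.\0".encode("utf-8")
# Faulty stage: the input NUL is copied through and another one is emitted.
bad = "Dóna'm la mà.\0\0Bon dia.\0\0".encode("utf-8")

print(pair_up(requests, split_responses(good)))
print(pair_up(requests, split_responses(bad)))
```

With the duplicated terminator, the second request is paired with an empty
chunk, which is consistent with the blank translations reported in this
thread.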

Thank you so much for finding the bug, Joan!



On Wed, 30 Jan 2019 at 21:06, Joan Moratinos Jaume wrote:

> I've made a change that seems to work well. I've added a condition so
> that the original \0 is not copied. This is the output of 'git diff':
> --- a/lttoolbox/fst_processor.cc
> +++ b/lttoolbox/fst_processor.cc
> @@ -2154,7 +2154,8 @@ FSTProcessor::intergeneration(FILE *input, FILE *output)
>            {
>              fputwc_unlocked(L'\\', output);
>            }
> -          fputwc_unlocked(val, output);
> +          if (val != L'\0')
> +            fputwc_unlocked(val, output);
>          }
>        }
>        else
>
>
> On Wed, 30 Jan 2019 at 11:30, Xavi Ivars  wrote:
>
>>
>>
>> On Tue, 29 Jan 2019 at 20:19, Joan Moratinos Jaume wrote:
>>
>>> The versions I have of lttoolbox, apertium and apy are the most recent
>>> ones. Apy works perfectly for me, unless the "lt-proc -x -z" step is
>>> included.
>>> Are you sure you use this step at SoftCatalà? I've seen that it
>>> translates "Dame la mano." as "Dóna'm la mà", with the old diacritics.
>>>
>>
>> You're right, we don't use that version yet. In any case, the idea is to
>> offer the option of using either the old diacritics or the new ones
>> (that's why we haven't put the new ones directly into the normal mode).
>>
>>
>>> I've managed to see what the problem is and I've fixed it on my
>>> machine, but it's probably not a definitive solution. "lt-proc -x -z"
>>> copies the '\0' character from the input (if there is one) and
>>> generates another one. This duplication is the cause of the problem.
>>> With the line that sends '\0' to the output disabled, everything runs
>>> smoothly.
>>>
>>>
>> It possibly is the definitive solution, if what was happening is that
>> two \0 were being passed. Only one should be passed. But when I
>> implemented the mode a few months ago, I couldn't see exactly where the
>> original \0 was passed through. Did you find it?
>>
>> --
>> < Xavi Ivars >
>> < http://xavi.ivars.me >
>
>
> --
> Joan Moratinos
> jmorati...@gmail.com


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] apy/translate falla amb lt-proc -x

2019-01-30 Thread Xavi Ivars
On Tue, 29 Jan 2019 at 20:19, Joan Moratinos Jaume wrote:

> The versions I have of lttoolbox, apertium and apy are the most recent
> ones. Apy works perfectly for me, unless the "lt-proc -x -z" step is
> included.
> Are you sure you use this step at SoftCatalà? I've seen that it
> translates "Dame la mano." as "Dóna'm la mà", with the old diacritics.
>

You're right, we don't use that version yet. In any case, the idea is to
offer the option of using either the old diacritics or the new ones (that's
why we haven't put the new ones directly into the normal mode).


> I've managed to see what the problem is and I've fixed it on my machine,
> but it's probably not a definitive solution. "lt-proc -x -z" copies the
> '\0' character from the input (if there is one) and generates another
> one. This duplication is the cause of the problem. With the line that
> sends '\0' to the output disabled, everything runs smoothly.
>
>
It possibly is the definitive solution, if what was happening is that two \0
were being passed. Only one should be passed. But when I implemented the
mode a few months ago, I couldn't see exactly where the original \0 was
passed through. Did you find it?

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] apertium-separable (was Re: Current GSOC ideas)

2019-01-28 Thread Xavi Ivars
eng-cat, for example, is also using apertium-separable before bilingual:

https://github.com/apertium/apertium-eng-cat/blob/master/modes.xml

Marc worked on integrating that module in the pipeline.

The main difference seems to be that none of these pairs (fra-cat,
eng-cat, ...) use a reverse-compiled dictionary, so we may not have faced
the problems you have.
--
Xavi Ivars
< http://xavi.ivars.me >


On Tue, 29 Jan 2019 at 06:53, Hèctor Alòs i Font wrote:

> On Tue, 29 Jan 2019 at 6:37, Jonathan Washington wrote:
>
>>
>> On Mon, 28 Jan 2019 at 15:32, Hèctor Alòs i Font wrote:
>>
>>> On Mon, 28 Jan 2019 at 21:13, Francis Tyers wrote:
>>>
>>>>
>>>> There is a lot of functionality that is not used widely that could
>>>> really be used to improve performance of language pairs.
>>>>
>>>> * apertium-separable
>>>> * weights in lttoolbox
>>>> * weighted transfer
>>>>
>>>
>>> I'd like to try them. I have already used apertium-separable in fra-cat
>>> and it really helps,
>>>
>>
>> Saluton Hèctor,
>>
>> I'm curious how you fared with apertium-separable.  I have a specific
>> question about it:
>>
>> What is your pipeline now?  My imagined pipeline is something like this:
>> [xxx tagger] → [xxx-yyy lsx] → [xxx-yyy dix] → [lrx] → [t*x] →
>> [yyy-xxx.reverse lsx] → [yyy generator]
>>
>> With this pipeline, you have to reverse-compile (in theory supported) the
>> apertium-separable transducer for the opposite direction and include that
>> in your pipeline before morphological generation.
>>
>> However, dolphingarlic and I were having trouble getting that working in
>> eng-deu.  Our work is in this branch:
>> https://github.com/apertium/apertium-eng-deu/tree/separable-words
>>
>
> The call to the apertium-separable module sits in two different places,
> depending on the direction of the translation.
>
> For fra2cat it is placed just after pretransfer, before lexical selection
> and transfer. But in cat2fra it is like you say: between transfer and
> generation. (
> https://github.com/apertium/apertium-fra-cat/blob/master/modes.xml ).
>
> In the fra-cat pair apertium-separable is used for multiword verbal
> expressions (like e.g. "avoir lieu") defined in the dictionaries. The
> problem is in the negative forms, e.g. "n'a pas lieu". In fra2cat it is
> changed to "n'a lieu pas" before it is searched in the dictionary. In
> cat2fra the usual pipeline gives something like "ne
> avoir lieu pas", so after transfer
> "pas" is reordered and just after "n'a pas lieu" is generated.
>
> You can take a look at
> https://github.com/apertium/apertium-fra-cat/blob/master/apertium-fra-cat.fra-cat.l1x
> and
> https://github.com/apertium/apertium-fra-cat/blob/master/apertium-fra-cat.cat-fra.l2x
>
> Hèctor


Re: [Apertium-stuff] apy/translate falla amb lt-proc -x

2019-01-27 Thread Xavi Ivars
After updating lttoolbox and apertium, does the thing you mentioned (no
translation being returned, or the first one being returned) work properly
now? Has that problem been solved?


--
Xavi Ivars
< http://xavi.ivars.me >

On Sat, 26 Jan 2019 at 21:26, Joan Moratinos Jaume wrote:

> Here are the small tweaks for apy. They allow [0-9_] in mode names and
> make it read the full variant.
>
> On Sat, 26 Jan 2019 at 12:25, Xavi Ivars  wrote:
>
>>
>>
>> On Sat, 26 Jan 2019 at 9:42, Joan Moratinos Jaume wrote:
>>
>>> I'm testing apy/translate with the new spa-cat pairs, which include
>>> "lt-proc -x '.../apertium-spa-cat/spa-cat.autopgen-diacritics-*.bin'".
>>> The first time, it works. After that, it returns a blank translation or
>>> the first translation, even though the arguments are different.
>>> Removing "lt-proc -x" makes it work again.
>>> Incidentally, a few small tweaks are needed for apy to recognise mode
>>> names such as "spa-cat_valencia_iec2017".
>>>
>>>
>> Hi Joan,
>>
>> If you remove "lt-proc -x", things won't work properly: that mode does a
>> pre-processing step before post-generation and, if it is skipped,
>> post-generation produces unwanted results.
>>
>> The thing is that the latest stable version of the lttoolbox and
>> apertium packages doesn't include the latest version of "lt-proc -x".
>> But I assume that if you're testing the new spa-cat packages, it's
>> because you're using them either directly from source or from the
>> *nightlies*. Either way, if you do the same with apertium and lttoolbox
>> (install from source or use the *nightlies*), you shouldn't have any
>> problems.
>>
>> At Softcatalà we're using apy this way and we have no problems.
>>
>> As for the tweaks needed in apy for the new modes to work, honestly I
>> haven't looked into it yet. If you spot the problem and can open a pull
>> request or mention it here, that would be great.
>>
>> Thanks!!
>> --
>> < Xavi Ivars >
>> < http://xavi.ivars.me >
>> ___
>> Apertium-stuff mailing list
>> Apertium-stuff@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>>
>
>
> --
> Joan Moratinos
> jmorati...@gmail.com
> ___
> Apertium-stuff mailing list
> Apertium-stuff@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>
___
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff


Re: [Apertium-stuff] apy/translate falla amb lt-proc -x

2019-01-26 Thread Xavi Ivars
On Sat, 26 Jan 2019 at 9:42, Joan Moratinos Jaume wrote:

> I'm testing apy/translate with the new spa-cat pairs, which include
> "lt-proc -x '.../apertium-spa-cat/spa-cat.autopgen-diacritics-*.bin'". The
> first time, it works. After that, it returns an empty translation or the
> first translation, even if the arguments are different. Removing
> "lt-proc -x" makes it work again.
> Incidentally, a few small tweaks are needed for apy to recognize mode
> names such as "spa-cat_valencia_iec2017".
>
>
Hi Joan,

If you remove "lt-proc -x", things won't work as they should: that mode
does a pre-processing pass before post-generation and, if it is skipped,
you get undesired results precisely in post-generation.

What's happening is that the latest stable version of the lttoolbox and
apertium packages doesn't include the latest version of "lt-proc -x". But I
assume that if you're testing the new spa-cat packages, you're using them
either directly from source or from the *nightlies*. Either way, if you do
the same with apertium and lttoolbox (install from source or use the
*nightlies*) you shouldn't have any problems.

At Softcatalà we're using apy this way and have no problems.

As for the tweaks apy needs so that the new modes work, to be honest I
haven't looked into it yet. If you see the problem and can open a
pull request or describe it here, that would be great.

Thanks!!
-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Stop merging lines

2018-11-09 Thread Xavi Ivars
What are the encodings that each of you are using in the shell? Is it a UTF
one in both cases?


--
Xavi Ivars
< http://xavi.ivars.me >

On Fri, 9 Nov 2018, 09:41, mansur <6688...@gmail.com> wrote:

> Strange.
> I uploaded my output here: https://filebin.net/c7mikerq2vwv08ql
>
>
> On Fri, 9 Nov 2018 at 11:31, Kevin Brubeck Unhammer <
> unham...@fsfe.org> wrote:
>
>> I still get 5 lines for that, could you upload the output you get too?
>> I get:
>>
>> http://sprunge.us/fJYZbm
>>
>> -Kevin
>>
>> mansur <6688000-re5jqeeqqe8avxtiumw...@public.gmane.org> wrote:
>>
>> > Hi!
>> > I uploaded it here:
>> > https://filebin.net/46e383wip8h2qcrc
>> >
>> >
>> > On Fri, 9 Nov 2018 at 11:00, Kevin Brubeck Unhammer <
>> > unham...@fsfe.org> wrote:
>> >
>> >> mansur <6688000-re5jqeeqqe8avxtiumw...@public.gmane.org> wrote:
>> >>
>> >> > One more example:
>> >> >
>> >> > - Фәнис Яруллин �
>> >> > - Фәнис Яруллинга багышланган чараларның һәрберсендә катнашырга
>> тырышам,
>> >> -
>> >> > диде әдипнең дусты Мохтар Афзалов.
>> >> >
>> >> > ^-/-$ ^Фәнис/Фәнис$
>> >> > ^Яруллин/Яруллин$ �-/-$
>> >> > ^Фәнис/Фәнис$ ^Яруллинга/Яруллин$
>> >> > ^багышланган/багышла$
>> >> ^чараларның/чара$
>> >> > ^һәрберсендә/*һәрберсендә$ ^катнашырга/катнаш$
>> >> > ^тырышам/тырыш$^,/,$ ^-/-$
>> >> > ^диде/ди$ ^әдипнең/әдип$
>> >> > ^дусты/дуст$ ^Мохтар/Мохтар$
>> >> > ^Афзалов/Афзалов+и$^./.$
>> >> >
>> >> > Here it happens because of some broken char... But why?
>> >>
>> >> I can't reproduce it, but maybe the broken character didn't survive the
>> >> e-mail. Could you e.g. put a text file with it on https://filebin.net/
>> ?
>> >>


Re: [Apertium-stuff] Apertium - new language pair

2018-11-08 Thread Xavi Ivars
On Thu, 8 Nov 2018 at 15:12, takajima kawasaki wrote:

> Hi,
>
> May I ask how to create a new language pair for the Apertium app?
> For example Japanese, Mandarin, and Korean.
>

Hi,

This is a good starting point

http://wiki.apertium.org/wiki/Apertium-init

You'd probably need to bootstrap monolingual and bilingual modules, as I
don't think we have those languages available (I may be wrong).

By the way, where did you find the link you were trying to open?

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] New release fra-cat

2018-10-29 Thread Xavi Ivars
The issue with apertium-separable was fixed. The problem, at least the one
I'm aware of, is with the perceptron (and also the unigram) tagger, which
doesn't accept null flush. That's why the trunk version of eng-cat wasn't
working. With the change I made yesterday, it should work now too.

Apertium-fra-cat is working properly via apy in a dev version that I have
here...
https://www.softcatala.org/api/dev/traductor/translate?q=bon+dia&langpair=cat|fra

I think the only thing needed would be to update everything (all packages)
on the server running apy.

On Sat, 27 Oct 2018 at 11:01, Tino Didriksen wrote:

> We are aware of the problem. Search for "separable" in:
> https://tinodidriksen.com/pisg/freenode/logs/%23apertium/2018-10-15.log
> https://tinodidriksen.com/pisg/freenode/logs/%23apertium/2018-10-16.log
>
> Not sure if anyone got to the bottom of it.
>
> -- Tino Didriksen
>
>
>
> On Sat, 27 Oct 2018 at 08:49, Hèctor Alòs i Font 
> wrote:
>
>> There is a problem with the French-to-Catalan translator in apertium.org.
>> It is always giving "Translation not yet available!". The opposite side is
>> working. Can someone take a look? Thanks in advance!
>> Hèctor
>>
>> On Tue, 18 Sep 2018 at 20:50, Hèctor Alòs i Font wrote:
>>
>>> I have prepared a new version of apertium-fra-cat. It is basically the
>>> one that was ready in April, but that could not be released because of
>>> problems in apertium-separable. The new one incorporates a few improvements
>>> from the new French-Occitan translator and has passed testvoc.
>>>
>>> Tino, could you please package it? Thanks in advance!
>>>
>>> It can also be uploaded to softcatala.org.
>>>
>>> Hèctor
>>>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Release: lttoolbox, apertium, -lex-tools, -separable, hfst, hfst-ospell, cg3

2018-10-04 Thread Xavi Ivars
On Thu, 4 Oct 2018 at 13:27, Tino Didriksen wrote:

> Released:
> - lttoolbox v3.5.0
> - apertium v3.5.2
> - apertium-lex-tools v0.2.1
> - apertium-separable v0.3.2
> - hfst v3.15.0
> - hfst-ospell v0.5.0
> - cg3 v1.1.7
>
>
Thank you Tino!!!

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Apertium eng-cat release

2018-10-04 Thread Xavi Ivars
BTW, it depends on a new release (yet to be done) of apertium-separable

On Thu, 4 Oct 2018 at 11:41, Xavi Ivars wrote:

> Tino, this one is also still pending.
>
> On Wed, 30 May 2018 at 21:06, Xavi Ivars wrote:
>
>>
>>
>> 2018-05-29 16:55 GMT+02:00 Marc Riera Irigoyen <
>> marc.riera.irigo...@gmail.com>:
>>
>>> Hello,
>>>
>>> Seeing that another pair depending on apertium-cat (apertium-spa-cat) is
>>> getting a release, I think it is a good idea to finally release the new
>>> version of apertium-eng-cat. The bug in apertium-separable which also
>>> affected apertium-fra-cat is now fixed (as far as I know), and the release
>>> has been kept on hold for quite a long time already. Given that the pair is
>>> clean with the latest commit in the apertium-cat repository, I think it
>>> makes sense to push it forward now.
>>>
>>>
>> In order to do this, new releases of lttoolbox and apertium-separable
>> have to be made as well.
>>
>> Tino, is there anything we can do to help you with the release?
>>
>>
>> --
>> < Xavi Ivars >
>> < http://xavi.ivars.me >
>>
>
>
> --
> < Xavi Ivars >
> < http://xavi.ivars.me >
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Apertium eng-cat release

2018-10-04 Thread Xavi Ivars
Tino, this one is also still pending.

On Wed, 30 May 2018 at 21:06, Xavi Ivars wrote:

>
>
> 2018-05-29 16:55 GMT+02:00 Marc Riera Irigoyen <
> marc.riera.irigo...@gmail.com>:
>
>> Hello,
>>
>> Seeing that another pair depending on apertium-cat (apertium-spa-cat) is
>> getting a release, I think it is a good idea to finally release the new
>> version of apertium-eng-cat. The bug in apertium-separable which also
>> affected apertium-fra-cat is now fixed (as far as I know), and the release
>> has been kept on hold for quite a long time already. Given that the pair is
>> clean with the latest commit in the apertium-cat repository, I think it
>> makes sense to push it forward now.
>>
>>
> In order to do this, new releases of lttoolbox and apertium-separable
> have to be made as well.
>
> Tino, is there anything we can do to help you with the release?
>
>
> --
> < Xavi Ivars >
> < http://xavi.ivars.me >
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] New versions of Apertium

2018-10-03 Thread Xavi Ivars
On Tue, 4 Sep 2018 at 12:22, Francis Tyers wrote:

> On 2018-09-04 11:07, Tino Didriksen wrote:
> >
> > What I want a sign-off for is something like: is this feature
> > complete? Was this all the weight stuff that was needed for now? Any
> > critically missing bits that makes this not releasable just yet?
>
> It could always use more testing, but I think it's working fine per
> the specifications.


Is there anything pending to release both lttoolbox and apertium now?

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Colaboración con el paquete apertium-es-gl

2018-09-25 Thread Xavi Ivars
Hi Enrique!

I've been meaning to write to you for a few days, but I was on holiday.
First of all, welcome to Apertium!!!

I'm Xavi Ivars, one of the Apertium contributors, mainly on the pairs
related to Catalan, but also contributing as I can to the project's
*core*. Besides Apertium, I've been part of Softcatalà for years, where
(among other things) we take care of translating the Mozilla project's
products into Catalan. I'm sure you know Jordi Serratosa, Toni Hermoso or
Joan Montané.

I've seen the changes you've made to the es-gl pair, and it's great that a
"founding" Apertium pair is receiving changes "from the community" so many
years later and, above all, that they are aimed at improving its
maintainability in the medium and long term.

There's just one thing that I would personally recommend if you intend to
contribute to the translator in the long term. A few years ago, the
Apertium "model" changed slightly: instead of keeping the monolingual
dictionaries and the bilingual one in the same package (in your case,
apertium-es-gl.es.dix, apertium-es-gl.gl.dix and apertium-es-gl.es-gl.dix),
"translator" packages now contain only the bilingual dictionary, while
there are new, completely monolingual "modules" (apertium-spa,
apertium-cat, apertium-eng).

This way, adding words, removing duplicates, fixing paradigms... is done
globally for every language pair that includes that language. The change
is a bit of a hassle, but it lets you pick up many corrections already made
in other packages and stay "up to date", making the most of the whole
community's work.

As an example, a couple of years ago I made this change in the
Catalan-Spanish pair, and right now the Catalan monolingual module is used
for the spa-cat, fra-cat, eng-cat and ron-cat pairs. It's really worth it :)

If you're interested, there's a GitHub issue for this [1]. The
documentation isn't very extensive, but plenty of people in the community
can help you do it.

Best regards, and welcome again!

[1] https://github.com/apertium/apertium-es-gl/issues/1


On Tue, 18 Sep 2018 at 12:38, Enrique Estévez Fernández wrote:

> Hello.
>
> I'm Enrique Estévez, better known as Keko in the free software world. I'm
> the coordinator of the Galician localization team for Mozilla products. At
> the moment I'm the coordinator of the Free Software Office (OSL) of CIXUG;
> CIXUG is a consortium formed by the three Galician universities.
>
> This OSL has had three coordinators since it was created more than five
> years ago. The two previous coordinators got in touch, as far as I know
> with Mikel Forcada, about integrating some improvements they had made to
> the package, but because these weren't in the right format or contained
> errors, in the end the changes were never uploaded.
>
> I've picked the work up again and am running tests locally. First I
> tested with a small script and a LibreOffice spreadsheet. Today I
> discovered the apertium-dixtools program, and I'm testing it. Let's see
> whether both methods give me the same result. The tests are about
> converting our local dictionary modifications to the format of the
> existing dictionaries in the apertium-es-gl repository, and merging both
> sets of dictionaries, so that our changes can be integrated and show up as
> identified.
>
> If that's fine with you, I would need commit access to the apertium-es-gl
> repository; otherwise I'll fork the repository and keep making changes
> until I finally submit a pull request. My GitHub user is keko
> (https://github.com/keko)
>
> My idea is the following:
> - Sort the terms or words in the dictionaries of the official package
> (I'm not sure which criterion to choose: alphabetically by entry, or by
> entry category, that is, nouns, adjectives, verbs, ... and then
> alphabetically by word within each category).
> - Check whether there are duplicate entries and, if so, remove them
> - Then add our modifications, that is, the words we have added to the
> dictionaries
>
> Any suggestion or help is appreciated.
>
> Regards.
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] New versions of Apertium

2018-09-04 Thread Xavi Ivars
On Tue, 4 Sep 2018 at 11:08, Tino Didriksen wrote:

>
>
> Merging to master is simply a case of "does it build and pass tests", imo.
>
> What I want a sign-off for is something like: is this feature complete?
> Was this all the weight stuff that was needed for now? Any critically
> missing bits that makes this not releasable just yet?
>

I see what you mean, but if the answer for this sign-off was "it's not
ready", it could potentially be blocking releasing other complete features,
so I personally wouldn't merge to master then. But we can have this
conversation later on :)

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] New versions of Apertium

2018-09-04 Thread Xavi Ivars
On Tue, 4 Sep 2018 at 10:58, Tino Didriksen wrote:

> Releasing new versions is most certainly the plan. Basically just need
> someone (e.g. the relevant mentors) to sign off on the weight code.
>

I thought that sign off was already done, as the code was merged to master.

I guess now it's OK, but in the future we should avoid merging code that we
think needs something else before a release happens.

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


[Apertium-stuff] New versions of Apertium

2018-09-04 Thread Xavi Ivars
Hi,

I wanted to suggest that we release new versions of both lttoolbox and
Apertium.

   - lttoolbox includes a couple of interesting new features:
      - weights & headers for transducers,
      - a new mode that can be used between gen and postgen
   - apertium includes support for weights, but also some critical fixes
     (C++ version agnostic usage of HMM tagger)


After that, we may need to publish a bunch of new languages and pairs that
will use the new features.

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Issues with apertium-tagger

2018-08-21 Thread Xavi Ivars
Found out what the problem was. It's REALLY weird, but documented in
several places, including the standard ([1], [2])

The problem is with this piece of code

#include <map>
int main() {
  std::map<int, int> m;
  // The stored value depends on which side of the assignment runs first.
  m[0] = m.size();
}


It seems that in gcc 6.x.x, m[0] got the value 1. The reason is that the
left side of the assignment is evaluated first to obtain a reference, and
[] creates a new element if it doesn't exist. So by the time m.size() is
evaluated, the size is actually 1.

Clang, on the other hand, was evaluating first m.size(), getting 0, and
then assigning that as the value of m[0].

gcc 7 [3] finally added support for a C++17 change called "Refining
Expression Evaluation Order for Idiomatic C++" [4]. It seems that this
brings the clang behavior to older C++ code too, which basically breaks
apertium-tagger here [5]:

index[t] = index.size()-1;

I've added a proposal for a fix, by keeping the way Collection.cc was
implemented but without relying on any specific version of C++.

int position = index.size();
index[t] = position;

We could also fix it doing things like

#if __GNUC__ >= 7
index[t] = index.size();
#else
index[t] = index.size()-1;
#endif

or with the __cplusplus macro instead. But I personally think the proposed
fix is better.

[1] http://open-std.org/JTC1/SC22/WG21/docs/papers/2014/n4228.pdf
[2]
https://blog.jayway.com/2015/09/08/undefined-behaviour-in-c-when-adding-to-map/
[3]
https://www.bfilipek.com/2017/12/cpp-status-2017.html#compiler-support-for-c17
[4] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0145r3.pdf
[5]
https://github.com/apertium/apertium/blob/master/apertium/collection.cc#L48

On Sat, 18 Aug 2018 at 20:19, Xavi Ivars wrote:

> OK, quite a lot more (useful) information regarding this.
>
> First of all, I created a branch that prints a lot of debug information
> (probabilities, etc.), which is only useful for this specific
> investigation. It'd be worth doing it properly, though, and keeping some of
> that information for the existing debug mode.
>
> https://github.com/apertium/apertium/tree/logging-hmm
>
> Now, the data.
>
> In a machine with Debian stretch, that works properly:
>
> echo '^cumbre/cumbre$ ^en/en$
> ^Madrid/Madrid/Madrid$^./.$' |
> apertium/apertium-tagger -gdmf /src/apertium-spa-cat/spa-cat.prob
> WORD = ({NOMF} Word: cumbre) TAGSET: 22,4 - Prob: 0.0323806
> ^cumbre/cumbre$END: Word: {NOMF} Word: cumbre
> WORD = ({PREP} Word: en) TAGSET: 43,22 - Prob: 0.23551
>  ^en/en$END: Word: {PREP} Word: en
> WORD = ({ANTROPONIM,TOPONIM} Word: Madrid) TAGSET: 36,43 - Prob:
> 0.000393184
> WORD = ({ANTROPONIM,TOPONIM} Word: Madrid) TAGSET: 37,43 - Prob:
> 0.000966503
> END: Word: {ANTROPONIM,TOPONIM} Word: Madrid
> WORD = ({TAG_SENT} Word: .) TAGSET: 4,36 - Prob: 6.08016e-05
> WORD = ({TAG_SENT} Word: .) TAGSET: 4,37 - Prob: 0.000192669
>  ^=Madrid/Madrid/Madrid$^./.$END: Word: {TAG_SENT} 
> Word:
> .
> WORD = ({TAG_kEOF} Word: ) TAGSET: 5,4 - Prob: 0.00133422
>
>
> In this case, tag 43 is PREP, tag 36 is ANTROPONIM (np.ant), tag 37 is
> TOPONIM (np.loc), and tag 4 is SENT. We can see in the first yellow line
> that the probability of prep + np.loc is roughly 2.5x the probability of
> prep + np.ant. Similarly, np.loc + sent is about 3x higher than np.ant +
> sent.
>
> Overall, this makes apertium-tagger's choice an easy one: np.loc over np.ant.
>
> Now, the same test on a machine running Ubuntu 18.04 (bionic). Just to be
> sure: both machines are running the latest lttoolbox (from the nightly
> package) and the latest apertium-tagger (built from source), with the same
> probability file.
>
> $ echo '^cumbre/cumbre$ ^en/en$
> ^Madrid/Madrid/Madrid$^./.$' |
> apertium/apertium-tagger -gdmf ~/src/apertium/apertium-spa-cat/spa-cat.prob
> WORD = ({NOMF} Word: cumbre) TAGSET: 22,4 - Prob: 4.53636e-12
> ^cumbre/cumbre$END: Word: {NOMF} Word: cumbre
> WORD = ({PREP} Word: en) TAGSET: 43,22 - Prob: 2.56179e-11
>  ^en/en$END: Word: {PREP} Word: en
> WORD = ({ANTROPONIM,TOPONIM} Word: Madrid) TAGSET: 36,43 - Prob:
> 0.000561191
> WORD = ({ANTROPONIM,TOPONIM} Word: Madrid) TAGSET: 37,43 - Prob:
> 5.13191e-12
> END: Word: {ANTROPONIM,TOPONIM} Word: Madrid
> WORD = ({TAG_SENT} Word: .) TAGSET: 4,36 - Prob: 8.67821e-15
> WORD = ({TAG_SENT} Word: .) TAGSET: 4,37 - Prob: 1.02303e-22
>  ^=Madrid/Madrid/Madrid$^./.$END: Word: {TAG_SENT} 
> Word:
> .
> WORD = ({TAG_kEOF} Word: ) TAGSET: 5,4 - Prob: 1.33422e-13
>
>
> I've highlighted in this case the rows that make the tagger prefer np.ant
> over np.loc. The probabilities are higher, so the decision is also clear.
> But it is very weird that the probabilities differ with the same input
> and the same .prob file. And not only here: we can see the same thing
> for every single 

Re: [Apertium-stuff] Issues with apertium-tagger

2018-08-18 Thread Xavi Ivars
OK, quite a lot more (useful) information regarding this.

First of all, I created a branch that prints a lot of debug information
(probabilities, etc.), which is only useful for this specific
investigation. It'd be worth doing it properly, though, and keeping some of
that information for the existing debug mode.

https://github.com/apertium/apertium/tree/logging-hmm

Now, the data.

In a machine with Debian stretch, that works properly:

echo '^cumbre/cumbre$ ^en/en$
^Madrid/Madrid/Madrid$^./.$' |
apertium/apertium-tagger -gdmf /src/apertium-spa-cat/spa-cat.prob
WORD = ({NOMF} Word: cumbre) TAGSET: 22,4 - Prob: 0.0323806
^cumbre/cumbre$END: Word: {NOMF} Word: cumbre
WORD = ({PREP} Word: en) TAGSET: 43,22 - Prob: 0.23551
 ^en/en$END: Word: {PREP} Word: en
WORD = ({ANTROPONIM,TOPONIM} Word: Madrid) TAGSET: 36,43 - Prob: 0.000393184
WORD = ({ANTROPONIM,TOPONIM} Word: Madrid) TAGSET: 37,43 - Prob: 0.000966503
END: Word: {ANTROPONIM,TOPONIM} Word: Madrid
WORD = ({TAG_SENT} Word: .) TAGSET: 4,36 - Prob: 6.08016e-05
WORD = ({TAG_SENT} Word: .) TAGSET: 4,37 - Prob: 0.000192669
 ^=Madrid/Madrid/Madrid$^./.$END: Word:
{TAG_SENT} Word:
.
WORD = ({TAG_kEOF} Word: ) TAGSET: 5,4 - Prob: 0.00133422


In this case, tag 43 is PREP, tag 36 is ANTROPONIM (np.ant), tag 37 is
TOPONIM (np.loc), and tag 4 is SENT. We can see in the first yellow line
that the probability of prep + np.loc is roughly 2.5x the probability of
prep + np.ant. Similarly, np.loc + sent is about 3x higher than np.ant +
sent.

Overall, this makes apertium-tagger's choice an easy one: np.loc over np.ant.

Now, the same test on a machine running Ubuntu 18.04 (bionic). Just to be
sure: both machines are running the latest lttoolbox (from the nightly
package) and the latest apertium-tagger (built from source), with the same
probability file.

$ echo '^cumbre/cumbre$ ^en/en$
^Madrid/Madrid/Madrid$^./.$' |
apertium/apertium-tagger -gdmf ~/src/apertium/apertium-spa-cat/spa-cat.prob
WORD = ({NOMF} Word: cumbre) TAGSET: 22,4 - Prob: 4.53636e-12
^cumbre/cumbre$END: Word: {NOMF} Word: cumbre
WORD = ({PREP} Word: en) TAGSET: 43,22 - Prob: 2.56179e-11
 ^en/en$END: Word: {PREP} Word: en
WORD = ({ANTROPONIM,TOPONIM} Word: Madrid) TAGSET: 36,43 - Prob: 0.000561191
WORD = ({ANTROPONIM,TOPONIM} Word: Madrid) TAGSET: 37,43 - Prob: 5.13191e-12
END: Word: {ANTROPONIM,TOPONIM} Word: Madrid
WORD = ({TAG_SENT} Word: .) TAGSET: 4,36 - Prob: 8.67821e-15
WORD = ({TAG_SENT} Word: .) TAGSET: 4,37 - Prob: 1.02303e-22
 ^=Madrid/Madrid/Madrid$^./.$END: Word:
{TAG_SENT} Word:
.
WORD = ({TAG_kEOF} Word: ) TAGSET: 5,4 - Prob: 1.33422e-13


I've highlighted in this case the rows that make the tagger prefer np.ant
over np.loc. The probabilities are higher, so the decision is also clear.
But it is very weird that the probabilities differ with the same input
and the same .prob file. And not only here: we can see the same thing
for every single probability computed by the tagger: *all of them are
different.*

I'm not sure how we should proceed with this, but IMHO it is quite concerning
to have this type of instability (can we call it a "bug"?😊) in the core of
apertium's pipeline.

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Issues with apertium-tagger

2018-08-18 Thread Xavi Ivars
In Debian Jessie, it also works properly.

On Sat, 18 Aug 2018 at 18:50, Xavi Ivars wrote:

>
>
> On Wed, 8 Aug 2018 at 10:42, Tino Didriksen wrote:
>
>>
>> The lttoolbox segfault is now fixed and passes its own "make test". I'll
>> investigate tagger further.
>>
>
> We've dug further into it and narrowed it down to something that could be
> the culprit (even if it doesn't really make sense): the version of
> Debian/Ubuntu.
>
> Basically, we haven't managed to reproduce the error in any installation
> using xenial/stretch. It really didn't matter whether we were using
> apertium-nightly or compiling from source. And with nightly, it didn't
> matter either whether we were using the latest version or a month-old one.
>
> On the other hand, we haven't managed to get a correct result using
> bionic. The results are always wrong, whether using nightly packages or
> compiling the source code manually.
>
> --
> < Xavi Ivars >
> < http://xavi.ivars.me >
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Issues with apertium-tagger

2018-08-18 Thread Xavi Ivars
On Wed, 8 Aug 2018 at 10:42, Tino Didriksen wrote:

>
> The lttoolbox segfault is now fixed and passes its own "make test". I'll
> investigate tagger further.
>

We've dug further into it and narrowed it down to something that could be
the culprit (even if it doesn't really make sense): the version of
Debian/Ubuntu.

Basically, we haven't managed to reproduce the error in any installation
using xenial/stretch. It really didn't matter whether we were using
apertium-nightly or compiling from source. And with nightly, it didn't
matter either whether we were using the latest version or a month-old one.

On the other hand, we haven't managed to get a correct result using bionic.
The results are always wrong, whether using nightly packages or compiling
the source code manually.

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


[Apertium-stuff] New lt-proc mode

2018-08-16 Thread Xavi Ivars
Hi all,

I've added a new mode to lt-proc, -x, named (for now) inter-generation;
a better name and flag are more than welcome :)

You can find the new feature in this branch (waiting for another pair of
eyes to make sure it works)

https://github.com/apertium/lttoolbox/pull/29/

It's meant to act on the same type of FST as the postgen modules, with some
key differences:

   - It doesn't remove all ~, only the ones that matched
   - It fully processes the characters found on the left part
      - post-generation mode is meant to act on strings like ~wordn,
        where word changes based on the "n" character coming after the
        word. Because of that, it moves processing back and n gets
        reprocessed.
      - inter-generation, on the other hand, consumes all characters
        matched on the input string. That gives some benefits, like being
        able to change a string at the end of the input, without the need
        of having a blank + another word after that. It also adds some
        constraints, like limiting changes to words themselves.


Overall, this new mode allows the following scenario, with multiple
inter-generation steps (similar to the interchunk/postchunk stages in
transfer):

... | lt-proc -g lang.autogen.bin | lt-proc -x lang.autointergen.1.bin |
lt-proc -x lang.autointergen.2.bin | ... | lt-proc -x
lang.autointergen.n.bin | lt-proc -p lang.autopgen

Let me know if you have any feedback/suggestion/question/...

Thanks!
-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Issues with apertium-tagger

2018-08-08 Thread Xavi Ivars
On Tue, 7 Aug 2018 at 9:04, Abinash Senapati wrote:

> The only problem we have now is segmentation fault for pre-compiled
> binaries. If the language pair is building fine I don't think any
> inconsistencies will arise because of the addition of weights.
>

If that is not the root cause, we need to find out what the root cause is.
But having two versions of the tagger that, with exactly the same input and
the same .prob file, do not generate the same output is really bad.

I double-checked, and the coarse tags for the tagger do distinguish between
both input words (np.loc and np.ant). Is there anyone who can help a bit
with debugging apertium-tagger? Things like displaying the probabilities for
each option would help us identify whether this is just a coincidence (both
tags have exactly the same probabilities) or whether there's something
wrong.



-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Issues with apertium-tagger

2018-08-06 Thread Xavi Ivars
Message from Francis Tyers on Tue, 7 Aug 2018 at 0:13:

>
> I'd say that's feasible as it will have changed the topology of some
> transducers. Although in principle if the tagger is doing its job
> as expected, that shouldn't happen ... e.g. it should only happen if
> there is ambiguity left.
>
>
Is there any way to check if there is ambiguity left? Something like
printing the distinct probabilities of the different analysis?

I've been looking through the code, but I can't see any way of properly
debugging this. Ideas?
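
One rough way to at least see where ambiguity is present at a given stage of the pipeline is to scan the stream for lexical units that still carry more than one analysis. The sketch below is only that: it assumes the usual ^surface/analysis1/analysis2$ stream format, ignores escaped slashes and superblanks, and does not expose any tagger probabilities. The function name and the sample tags are hypothetical.

```python
import re

# One lexical unit in the Apertium stream format: ^surface/analysis1/analysis2$
# Assumption: no escaped "/" or "$" inside the unit (good enough for a quick check).
LEXICAL_UNIT = re.compile(r"\^(.*?)\$")


def ambiguous_units(stream):
    """Return (surface, analyses) for each lexical unit with two or more analyses."""
    found = []
    for match in LEXICAL_UNIT.finditer(stream):
        surface, *analyses = match.group(1).split("/")
        if len(analyses) > 1:
            found.append((surface, analyses))
    return found


if __name__ == "__main__":
    # Illustrative input only; these tags are assumptions, not the real analyses.
    sample = "^en/en<pr>$ ^Madrid/Madrid<np><loc>/Madrid<np><ant>$"
    for surface, analyses in ambiguous_units(sample):
        print(surface, "->", " | ".join(analyses))
```

Running it on the analyser output and again on the tagger output would show which units were still ambiguous before tagging, which narrows down where the two tagger versions can diverge.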


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Implementation of weights in lttoolbox

2018-08-06 Thread Xavi Ivars
Message from Flammie on Sun, 5 Aug 2018 at 14:54:

>
> [...]
>
> There is a -signed weight system in e.g. openfst that was added exactly
> for this intuition of weights, so it operates on R-, min, - instead of
> R+, max, +, maybe it could be an ok solution?
>
>
You mean still having "bigger weights are worse" but using negative numbers
(in which case a smaller abs(x) would be worse, as those values are actually
bigger), so that we would be consistent with lexical selection?

That would work, I guess. In any case, as long as there's a reason to do it
the way it is now, I'm OK. It was more an "out of curiosity" question, and
some bias towards consistency that I have, than a real concern.

We just need to make sure that this is properly documented.
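
To make the convention question concrete, here is a small sketch (plain Python, not lttoolbox code) showing that "lowest weight wins" and "highest weight wins" select the same candidate once the weights are negated. The candidate list and its weights are made up for illustration.

```python
def pick_lowest(candidates):
    """Lowest weight wins (the behaviour described for lttoolbox in this thread)."""
    return min(candidates, key=lambda pair: pair[1])[0]


def pick_highest(candidates):
    """Highest weight wins (the convention mentioned for lexical selection rules)."""
    return max(candidates, key=lambda pair: pair[1])[0]


def negate(candidates):
    """Flip the sign of every weight, keeping the candidates in the same order."""
    return [(name, -weight) for name, weight in candidates]


if __name__ == "__main__":
    # Hypothetical weights: 1.0 marks the default translation, others are > 1.0.
    translations = [("station", 2.0), ("season", 1.0)]
    print(pick_lowest(translations))           # default under "lowest wins"
    print(pick_highest(negate(translations)))  # same choice under "highest wins"
```

Whichever convention is kept, documenting it next to the weight attribute would avoid people having to work this out for themselves.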

Thanks!

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


[Apertium-stuff] Issues with apertium-tagger

2018-08-06 Thread Xavi Ivars
Hi,

We've recently seen some issues with apertium-tagger in recent nightlies.

This is what happens with nightly

Version: 3.5.1+g697~a60e0bc0-1~stretch1 (few days ago)

$ echo '^cumbre/cumbre$ ^en/en$
> ^Madrid/Madrid/Madrid$^./.$' | apertium-tagger -gdm
> /src/apertium-spa-cat/spa-cat.prob
> ^cumbre$ ^en$ ^=Madrid$^.$


On the other hand, this is what happens with  Version:
3.5.1+g702~9d8ad1a2-1~bionic1

$ echo '^cumbre/cumbre$ ^en/en$
> ^Madrid/Madrid/Madrid$^./.$' | apertium-tagger -gdm
> /src/apertium-spa-cat/spa-cat.prob
> ^cumbre$ ^en$ ^=Madrid$^.$


In both cases, the .prob file is the same

$ md5sum ./spa-cat.prob
> aaf24085338f39f9133b65ca73de71f9  ./spa-cat.prob
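
For anyone repeating this check across machines without md5sum at hand, the same comparison can be sketched in Python with the standard hashlib module. The helper names are hypothetical, and the paths passed in would be whatever local copies of the .prob file you have.

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file_contents(path_a, path_b):
    """True when both files have identical MD5 digests."""
    return file_md5(path_a) == file_md5(path_b)
```

A matching digest only confirms that the .prob files are identical; it says nothing about the tagger binaries themselves.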


We've managed to reproduce the instability in different environments, and we
cannot understand why the tagger is not consistent.

Is it possible that the recent changes related to weighted FSTs have had any
impact on it?

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Implementation of weights in lttoolbox

2018-08-04 Thread Xavi Ivars
This is great!

However, it is a bit counter-intuitive that the lower weight wins. In other
cases (like lexical selection rules), higher weights win.

Would it be possible to do something so weights work consistently?

Message from Francis Tyers on Sat, 4 Aug 2018 at 17:53:

> On 2018-08-03 15:42, Abinash Senapati wrote:
> > Hello developers,
> > I am a student currently working on the idea "Extend lttoolbox to have
> > the power of HFST" for my GSoC project. I am here to talk about the
> > new modifications that are now part of lttoolbox, and I would like all
> > of you to try them out. As part of my coding challenge I developed
> > a module that converts lexc files to the dix file format. The
> > repo for the package is https://github.com/Techievena/lexc2dix. So
> > these are the changes we have in lttoolbox right now.
> >
> > Currently lttoolbox allows weights in the binary files. Here
> > is a snippet of that.
>
> Thanks Abinash! Excellent work!
>
> What this means is that you can now weight your morphological analysers,
> generators and bilingual dictionaries.
>
> Here are some problems that this can solve:
>
> 1) Having zero-context rules in your .lrx files. Now you can just put
> the weights directly in your bilingual dictionary
>
> $ echo "^estación$" | lt-proc -W -b testbidix.bin
> ^estación/season/station$
>
> $ echo "^estación$" | lt-proc -b testbidix.bin
> ^estación/season/station$
>
> Analyses will be output according to lowest weight first. So you can
> mark your default translation as "1.0" and then all others as >1.0 ...
> because of how transfer works, it will always take the first, which
> will be the one with the lowest weight.
>
> 2) Improving POS-tagging accuracy by ordering analyses by probability.
> This way, if your CG doesn't mop up all the ambiguity, you will get the
> best remaining analysis. This works kind of like the unigram tagger, but
> because it can be in the analyser itself, it can be easier to control.
>
> 3) Dealing with non-standard forms: instead of having to use LR/RL
> direction restrictions, you can just make non-standard forms have a high
> weight and ask for lt-proc to only generate the surface form with the
> lowest weight.
>
> There will no doubt be even more fun stuff that we can do with weights.
> I for one think it's very exciting and would encourage people to play
> around with it and see what they can come up with.
>
> Fran
>
>
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Asking Apertium not to tranlate farther when deformating

2018-06-11 Thread Xavi Ivars
Message from Chardonneau Bernard on Mon, 11 June 2018 at 9:34:

>
>
> Yes, if no tag was provided for that in Apertium's design, an end of
> sentence "." can be a solution. But to avoid extra uppercase characters,
> a "," seems to work better.
>
> So I am thinking of adding ",[]" sequences at the end of lines not ending
> with a " and taking them off when reformatting.
>

I guess this will work in most cases. But I can imagine some rules may
apply across ",".

You would probably achieve the same result by adding an extra line break
at the end of each line. So, instead of

LA  la
FLEUR   fleur
ROUGE   rouge

something like

LA  la

FLEUR   fleur

ROUGE   rouge
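
A minimal sketch of that pre/post-processing step in Python, assuming the input has one entry per line and no blank lines of its own (otherwise the second helper would also drop those). The function names are hypothetical, and whether the blank line really stops rules from applying across entries depends on the deformatter in use.

```python
def separate_lines(text):
    """Put an empty line between consecutive lines before translating."""
    return "\n\n".join(text.splitlines()) + "\n"


def rejoin_lines(text):
    """Drop the empty lines again after translating."""
    return "\n".join(line for line in text.splitlines() if line) + "\n"


if __name__ == "__main__":
    original = "LA  la\nFLEUR   fleur\nROUGE   rouge\n"
    spaced = separate_lines(original)
    print(spaced, end="")
    print(rejoin_lines(spaced) == original)
```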

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Asking Apertium not to tranlate farther when deformating

2018-06-10 Thread Xavi Ivars
I'm assuming you're using a custom deformatter, right?

If that's the case, you could make the deformatter force an "end of
sentence" for each line. That would solve your problem.

Message from Chardonneau Bernard on Fri, 8 June 2018 at 9:03:

> Hello,
>
> First, my email bechapert...@free.fr is still (and will stay) valid, but
> as sourceforge seems to refuse any mails from several big providers,
> I use another one to write here.
>
> I want to translate only the right part of each line of a file.
> Here is a stupid example of what is done when using " "
>
> Original
> LA  "la"
> FLEUR   "fleur"
> ROUGE   "rouge"
>
> Translated
> LA  "The"
> FLEUR   "Flower"
> ROUGE   "Red"
>
> except for upper/lower case, it works.
>
> But without the " "
>
> Original
> LA  la
> FLEUR   fleur
> ROUGE   rouge
>
> Deformatting
> [LA  ]la[
> FLEUR   ]fleur[
> ROUGE   ]rouge[
> ][]
>
> Translated
> LA  The
> FLEUR   red
> ROUGE   flower
>
> I would like to avoid interaction between two following lines.
>
> Since apertium-destxt sometimes produces several [], I tried adding them,
> but it does not change anything.
>
> Is there an official way to tell Apertium that what is before a particular
> tag must not interact with what is after ?
>
>
>
>
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Apertium eng-cat release

2018-05-30 Thread Xavi Ivars
2018-05-29 16:55 GMT+02:00 Marc Riera Irigoyen <
marc.riera.irigo...@gmail.com>:

> Hello,
>
> Seeing that another pair depending on apertium-cat (apertium-spa-cat) is
> getting a release, I think it is a good idea to finally release the new
> version of apertium-eng-cat. The bug in apertium-separable which also
> affected apertium-fra-cat is now fixed (as far as I know), and the release
> has been kept on hold for quite a long time already. Given that the pair is
> clean with the latest commit in the apertium-cat repository, I think it
> makes sense to push it forward now.
>
>
In order to do this, new versions of lttoolbox and apertium-separable have
to be released as well.

Tino, is there anything we can do to help you with the release?


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] [Apertium-catala] publicar una nova versió del parell spa-cat

2018-05-30 Thread Xavi Ivars
Tino, can you publish a new version of apertium-cat, apertium-spa, and
apertium-spa-cat?

On 29 May 2018 at 14:58, Francis Tyers wrote:

> On 2018-05-29 13:02, Jaume Ortolà i Font wrote:
>
>> Good morning,
>>
>> Over the last few months we have made many improvements to the spa-cat
>> pair. Lately we have also run many tests, and we have reached a point of
>> a certain stability.
>>
>> I think a new official version could now be published, so that the
>> improvements can be used on different platforms (for example, in the
>> Wikipedia translator). Who should do this?
>>
>>
> Tino,
>
> Could you publish a new version of spa-cat ?
>
> Thanks! :)
>
> Fran
>
> 
>



-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Apertium-deswxml

2018-05-15 Thread Xavi Ivars
2018-05-15 18:54 GMT+02:00 Daniel Torregrosa :

> Hi all,
>
> I have a couple of questions about the apertium formatters.
>
> Is there a way to download and compile the formatters without downloading
> the full apertium core and lttoolbox?
>
> Also, I need to translate word documents keeping the format. I think it is
> possible to unzip the docx file, deswxml word/document.xml, translate
> (assuming the superblanks don't get moved or mangled), rewxml and repack
> the zip, but I fear this can be very unrobust and fail for any reason. Do
> any of you have experience with this?
>
>
Regarding your second point, something we did in the past (which wasn't
"exactly" keeping the format) was to use LibreOffice's API to convert the
docx file to ODT, and then translate that file. Not sure if that will help,
though...
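
For the unzip/translate/repack idea quoted above, the archive handling itself can be sketched with Python's standard zipfile module. This is only the container part: the `transform` callable is a hypothetical stand-in for whatever deformat, translate and reformat chain you end up using, and the function name is made up for illustration.

```python
import io
import zipfile


def replace_member(docx_bytes, transform, member="word/document.xml"):
    """Return a copy of a zip archive with one member passed through transform.

    transform is a hypothetical callable taking and returning bytes; every
    other member is copied unchanged and the member order is preserved.
    """
    output = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as source:
        with zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
            for name in source.namelist():
                data = source.read(name)
                if name == member:
                    data = transform(data)
                target.writestr(name, data)
    return output.getvalue()
```

Working on bytes in memory keeps the original file untouched, so a failed translation step cannot leave a half-written document behind. It does not address the robustness concern about superblanks being moved or mangled.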

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


[Apertium-stuff] New instance of Apertium online

2018-05-07 Thread Xavi Ivars
Hi,

The *Generalitat Valenciana* (the Valencian government) has recently
presented their updated SALT portal. SALT was previously a proprietary
translator/spell-checker/grammar-checker that was used internally by the
government and distributed to other public institutions (local governments,
etc.).

Last week, they presented a new version of the portal, with a brand new
translator and spell/grammar checking experiences. And good news!
Everything based on free/open-source software. They're using Apertium [1]
to power the translator, and LanguageTool to power the spell checker.

They've been contributing back all the improvements they've made to the
dictionaries and rules: directly, through Donís (a member of this list), and
indirectly, through Prompsit and Jaume Ortolà. Thanks a lot for working so
hard to improve the language pair! And even though they are not required to
by the licence, they are also acknowledging the technology powering their new
portal.

IMHO, this is a very good example of use of free/open-source software by a
public administration, and a clear *win* for Apertium as a project.

[1] http://www.salt.gva.es/

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] apertium-oci

2018-03-29 Thread Xavi Ivars
On 29 March 2018 at 15:19, Hèctor Alòs i Font <
hectora...@gmail.com> wrote:

> Claudi Balaguer and I are looking at apertium-oci with a view to adopting
> the French-Occitan pair as part of GSoC. The project would build this
> pair in referential Occitan, while keeping it open to adding other
> varieties of Occitan (in particular Aranese, common Gascon, etc.).
>
> I have an initial question: why does the Occitan monolingual dictionary
> adopt a restrictive conception of varieties? In the Catalan dictionary,
> any form with the "val" tag is analysed as possible when the "cat" variety
> is used: "cante" is understood, but "canto" is always generated. Right
> now, in the Occitan dictionary, a form tagged as alt="oci@aran" is not
> excluded from the "oci" variety unless it is explicitly defined. This
> leads to a multiplicity of definitions in the dictionary that would seem
> unnecessary if the "Catalan" strategy were adopted:
>
>   emocionau n="generau__adj"/>
> emocionau n="generau__adj"/>
> emocionau n="generau__adj"/>
>
>
> Is there any reason that makes it preferable for Occitan to follow this
> convention? Wouldn't it be better to adopt the "Catalan" way? Is there too
> much morphological or lexical variability and ambiguity in Occitan to do so?
>
>
Gema will surely be able to answer this better, but I seem to recall from
the days of Catalan-Occitan and Spanish-Occitan that this was exactly the
problem: there is so much variety that, in most cases, analysing all the
words turned out to be counterproductive, and it was better to simply
ignore them (unless the analysis was explicit).

Regarding oci@aran and oci@gascon, is there any difference in the
dictionaries right now? Occitan-Catalan and Occitan-Spanish only have oci
and oci@aran, so I would assume that oci@gascon is simply the same as
oci@aran (unless someone has added explicit differences recently).


-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] Gsoc18 proposal

2018-03-25 Thread Xavi Ivars
2018-03-25 15:03 GMT+02:00 Suraj Tamgale :

> Hi Xavi,
> Following your review, I have modified both my proposal and my timeline.
> Please check it once more; I would be very happy if you could suggest
> some changes.
> Link to my draft:
> https://docs.google.com/document/d/11uI3WsQ5_L4HEyETqqgmzXTsR7fUSCd01fKe8VhekXM/edit
>



Hi Suraj,

Please send your emails to the list, not individually to possible mentors
that reply ;)

Regarding your proposal, a few comments:


   - We have some tips for GSoC applications [1], one of them being a
   template for your proposal that will allow us to better align it to what
   we'd like from students. You may want to look into it.
   - Your proposal should include information about how you plan to do your
   job. Explaining how RBMT (section 1.2) or Apertium (section 1.3) work isn't
   really needed. We'd like to see more examples of what you've already done,
   instead.
   - In section 2.1, you say things like "*here we have collected the
   most common Sanskrit words and their English term and made the aligned
   parallel multilingual corpus*".
      - Where is *here*?
      - Who is *we*?
   - In section 2.2, you say "*We have used Penn Treebank part of speech
   (POS) tags [7,10-11], e.g. {Noun, Pronoun, Adjective, Verb, Adverb,
   Preposition, Conjunction, Interjection, Auxiliary, Determiner} = {NN, PP,
   JJ, VB, RB, IN, CC, UH, AUX, DT}. In slash notation, a sentence is like
   this: I/PP need/VB a/DT pen/NN.*"
      - Again, who is *we*?
      - Why have you added screenshots of some papers into your proposal?
      - Why do you include snippets of code using the NLTK module?


Again, as I mentioned in my previous feedback, it seems to me that you
haven't spent time really looking into what Apertium is. If you want to
better understand it, you should try to do the coding challenge requested
for your project ([2] or [3]), see what the problems are, and come up with
a realistic timeline for your project. Saying "*Addition of multiwords so
to reach quality of developed pair*" doesn't really mean anything: how many
multiwords do you plan to add? How will adding those multiwords improve the
coverage and precision of the language pair? What is, in your opinion,
"Documentation and example code"?

Please come to the IRC [4] if you need any help. Many people in the IRC
channel will be happy to answer any specific questions you may have.

Regards,


[1]  http://wiki.apertium.org/wiki/Top_tips_for_GSOC_applications
[2]
http://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code/Make_a_language_pair_state-of-the-art
[3]
http://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code/Adopt_a_language_pair
[4] http://wiki.apertium.org/wiki/IRC



-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] (no subject)

2018-03-24 Thread Xavi Ivars
Hi Suraj,

You know that Apertium is a rule based machine translation platform and not
a neural network based one, right? Because even if in your proposal you
start defining how Apertium works, most of your proposal is basically an
overview on how Neural Machine Translation works. And the section where you
speak about NMT is called "my basic module".

Also, based on your proposed weekly schedule, I didn't understand what your
plan is: all the weekly details are very vague and no specific points are
mentioned (design module, implement module, study algorithms, full
integration of algorithms, ... are examples of "things" you plan to do).

Could you try to be a bit more concrete on what you plan to do?

--
Xavi Ivars
< http://xavi.ivars.me >

On Sat, 24 March 2018, 23:00, Suraj Tamgale wrote:

> Hi everyone,
> I've submitted the proposal. I will be extremely happy if someone takes a
> look at it and gives a review.
>
> Link to my draft:
> https://docs.google.com/document/d/11uI3WsQ5_L4HEyETqqgmzXTsR7fUSCd01fKe8VhekXM/edit?usp=sharing
>
>
>


Re: [Apertium-stuff] apertium-spa does not compile

2018-03-08 Thread Xavi Ivars
2018-03-08 10:54 GMT+01:00 Francis Tyers :

> On 2018-03-08 09:05, Mikel L. Forcada wrote:
>
>> Dear all,
>>
>> I have just tried to compile one of the language pairs migrated to
>> GitHub (by the way, thanks a million everyone who migrated things!
>> Lots of hard work!).
>>
>> Apertium-spa does not compile because someone called a="GVA" added
>> entries in error.
>>
>
> Xavi fixed it now.
>
> Now that we are in GitHub, there is no excuse whatsoever for pushing
>> content that does not compile. One can always locally commit and only
>> push when things nicely compile.
>>
>>
> Exactly! Or they can work in branches and merge when it compiles.
>

Something that doesn't even compile should never get pushed (IMHO), but one
good thing is that we can now set up repository-specific workflows to avoid
things like this happening again.

And just to reinforce Mikel's message: thank you very much to all of you
involved in the migration. Especially to *Sushain* and *Sardulc*: the effort
you've put into this has been just amazing. Thank you so much.

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] New release fra-cat

2018-02-20 Thread Xavi Ivars
On 20 February 2018 at 12:48, Mikel L. Forcada wrote:

> Awesome! Brilliant! Thanks a million! Merci beaucoup!
> How does it work without CG? Is there a fall-back for apertium-caffeine,
> apertium-omegat, or the Android app?
>

There are a few pieces that won't work in the "java" flavour of Apertium:

   - Disambiguation
      - There's a ton of work done in CG that the tagger doesn't solve
      properly by itself. We need to retrain the taggers, and we'll do so in
      a future (soon) release.
   - Lexical selection
      - There has never been a port of Fran's lexical selection module (
      http://wiki.apertium.org/wiki/Constraint-based_lexical_selection_module),
      and there are almost 500 rules per direction now.

In the case of disambiguation, removing the "module" just makes the
translator work "worse", but it's 100% usable. But the lexical selection
module needs to be there, as the bilingual translation module no longer
chooses a single translation for a given word, and the next module
(transfer) expects a single one.

So as far as I know, there's no fallback.

-- 
< Xavi Ivars >
< http://xavi.ivars.me >


Re: [Apertium-stuff] New release fra-cat

2018-02-20 Thread Xavi Ivars
On 16 February 2018 at 21:09, Hèctor Alòs i Font <
hectora...@gmail.com> wrote:

> Last week Mikel asked for more frequent releases, so I prepared a new one for
> French and Catalan. Tino, could you pack it, please? This release needs
> python3 (I'm not sure whether this is correctly written somewhere in the
> files).
>
> Next follows a short explanation in Catalan of the improvements.
>

Hi Hector,

I can only say: THANK YOU

-- 
< Xavi Ivars >
< http://xavi.ivars.me >

