Hi Tim,

>we ran into issues in previous attempts at migration with the large file sizes 
>in our repo

Indeed we did, and over the years I have had thoughts on that.  

Those large files are large ML models, which are (mostly) static, 
replaceable/interchangeable, not always necessary, and kept in resource 
(-res) modules separate from the code modules.

When I was a ctakes newbie I really disliked the separation of code from resources 
by entirely separate -res modules.  Since then, through working on projects 
that use ctakes code but not (huge) resources as dependencies, I have realized 
the wisdom of the modular separation.  In fact, I put a -huge- model in its own 
-res module so that I could <exclude> it from a ctakes-dependent project, 
saving compile (download) time and disk space.  Like you, I don't like to 
"download the internet" with maven   ;^)
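
As a rough sketch of that kind of exclusion, a dependent project's pom.xml 
might contain something like the fragment below.  The artifact names 
ctakes-example and ctakes-example-res and the version are hypothetical 
placeholders, not actual module coordinates; only the <exclusions> 
mechanism itself is standard Maven (which spells the element <exclusion>).

```xml
<!-- Fragment for the <dependencies> section of a ctakes-dependent project. -->
<dependency>
  <groupId>org.apache.ctakes</groupId>
  <!-- Hypothetical code module that pulls in a -res module transitively. -->
  <artifactId>ctakes-example</artifactId>
  <!-- Placeholder version; substitute the release actually in use. -->
  <version>x.y.z</version>
  <exclusions>
    <!-- Skip the (hypothetical) huge model module so it is never downloaded. -->
    <exclusion>
      <groupId>org.apache.ctakes</groupId>
      <artifactId>ctakes-example-res</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

With an exclusion like this, Maven resolves the code module but leaves out 
the excluded -res artifact, so any code path that needs the model would 
have to get it some other way.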

Right now we have the NER dictionaries on SourceForge, not in the Apache repos.  
While this is done for legal reasons, it has worked pretty well.

I think that we could maintain an apache SVN repo of -res modules containing 
only huge model files.   I am guessing that we would have to make it a 
"side/sub project" to maintain a separate repo (jenkins build, etc.).   

Anyway, it would give us the freedom to use a github repo for code (and 
non-model resources) without users needing to go through the github large-file 
workflow, which I see as a barrier to entry.

Thoughts?

________________________________________
From: Miller, Timothy <timothy.mil...@childrens.harvard.edu.INVALID>
Sent: Thursday, June 2, 2022 6:21 PM
To: dev@ctakes.apache.org
Subject: Re: Apache cTAKES GitHub mirror is stuck in 2019 [EXTERNAL] 
[SUSPICIOUS] [SUSPICIOUS]

* External Email - Caution *


My recollection was that we ran into issues in previous attempts at migration 
with the large file sizes in our repo.
Tim


On Thu, 2022-06-02 at 20:55 +0000, Finan, Sean wrote:

Thank you Gandhi and Richard.


Unless somebody else beats me to it I will perform some research and see what 
approaches can be used and which might be best.  In the end the cTAKES Project 
Management Committee will need to vote for any action as sweeping as moving to 
github.


Sean

________________________________________

From: gandhi rajan <gandhiraja...@gmail.com>
Sent: Thursday, June 2, 2022 9:02 AM
To: dev@ctakes.apache.org
Subject: Re: Apache cTAKES GitHub mirror is stuck in 2019 [EXTERNAL]


Hi Sean,


If we are sure that the SVN has all the latest changes and active
development is primarily on SVN, then why don't we request a fresh git
repository and push all the changes over there?


More info on https://infra.apache.org/svn-to-git-migration.html



On Thu, Jun 2, 2022 at 5:52 PM Finan, Sean
<sean.fi...@childrens.harvard.edu.invalid> wrote:


Hi Richard, you bring up a valid concern.


cTAKES Developers:


The Apache Foundation has had an initiative to "move" all projects to
GitHub for some time now.


I don't know much about how this is done.  If anybody out there has
knowledge or experience that they can pass on, please share.


Thanks,

Sean

________________________________________

From: Richard Eckart de Castilho <r...@apache.org>
Sent: Thursday, June 2, 2022 3:39 AM
To: dev@ctakes.apache.org
Subject: Apache cTAKES GitHub mirror is stuck in 2019 [EXTERNAL]


Hi,


it appears that the GitHub mirror of Apache cTAKES may be stuck.


When I check the svn log of https://svn.apache.org/repos/asf/ctakes/trunk/,
I can see activity as recent as May 2022.


However, on GitHub, I can only see stale branches:

https://github.com/apache/ctakes/branches



Wouldn't it be good if the GitHub mirror would be kept up-to-date?


Best,


-- Richard




--

Regards,

Gandhi


"The best way to find urself is to lose urself in the service of others !!!"
