Hi everyone,

My name is Teresa (or terrrydactyl if you've seen me on IRC) and I've
been interning at Wikimedia for the last few months through the
Outreach Program for Women[1]. My project, Git2Pages[2], is an
extension to pull snippets of code/text from a git repository. I've
been working hard on learning PHP and the MediaWiki
framework/development cycle. My internship is ending soon and I wanted
to reach out to the community and ask for feedback.

Here's what the program currently does:
- User supplies (git) url, filename, branch, startline, endline using
the #snippet tag
- Git2Pages.body.php will validate the information and then pass on
the inputs into my library, GitRepository.php
- GitRepository will do a sparse checkout, that is, it will clone the
repository but only keep the specified file (this was implemented to
save space)
- The repositories will be cloned into a folder named with an MD5 hash
of the url + branch to make sure that the program isn't cloning a ton
of copies of the same repository
- If the repository already exists, the file will be added to the
sparse-checkout file and the program will update the working tree
- Once the repo is cloned, the program will go and yank the lines that
the user requested and it'll return the text encased in a <pre> tag.
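
The steps above can be sketched roughly as follows. This is a Python
illustration, not the extension's PHP code: the function names, the
exact git command sequence, and the HTML escaping are assumptions made
for the example.

```python
import hashlib
import html


def repo_dir_name(url, branch):
    """Folder name: MD5 hash of url + branch, so each pair maps to one clone."""
    return hashlib.md5((url + branch).encode("utf-8")).hexdigest()


def sparse_clone_commands(url, branch, dest):
    """Git commands for a first-time sparse checkout (assumed sequence).

    Between the second and third command, the caller appends the wanted
    file path to dest/.git/info/sparse-checkout.
    """
    return [
        ["git", "clone", "--no-checkout", "--branch", branch, url, dest],
        ["git", "-C", dest, "config", "core.sparseCheckout", "true"],
        ["git", "-C", dest, "checkout", branch],
    ]


def sparse_update_command(dest):
    """Command to refresh the working tree after adding another file path."""
    return ["git", "-C", dest, "read-tree", "-mu", "HEAD"]


def extract_snippet(text, startline, endline):
    """Return lines startline..endline (1-indexed, inclusive) inside <pre>."""
    if startline < 1 or endline < startline:
        raise ValueError("invalid line range")
    selected = text.splitlines()[startline - 1:endline]
    return "<pre>" + html.escape("\n".join(selected)) + "</pre>"
```

For example, extract_snippet(file_contents, 2, 3) would return the
second and third lines of the file wrapped in a <pre> tag.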

This is my baseline program. It works (for me at least). I have a few
ideas of what to work on next, but I would really like to know if I'm
going in the right direction. Is this something you would use? How
does my code look? Is the implementation up to the MediaWiki coding
standard? You can find the progression of the code on
gerrit[3].

Here are some ideas of what I might want to implement while still on
the internship:
- Instead of a <pre> tag, encase it in a <syntaxhighlight lang> tag if
it's code, maybe add a flag for user to supply the language
- Keep a database of all the repositories that a wiki has (though not
sure how to handle deletions)
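
The first idea might look something like the sketch below (Python for
illustration only; the wrap_snippet name, the optional lang argument,
and the language-name check are hypothetical, not part of the current
extension).

```python
import html
import re


def wrap_snippet(snippet, lang=None):
    """Wrap a snippet in <syntaxhighlight> when a language is given, else <pre>."""
    if lang is not None:
        # Accept only simple language names so the attribute cannot be broken out of.
        if not re.fullmatch(r"[A-Za-z0-9_+#-]+", lang):
            raise ValueError("unexpected language name")
        return '<syntaxhighlight lang="{}">\n{}\n</syntaxhighlight>'.format(lang, snippet)
    return "<pre>" + html.escape(snippet) + "</pre>"
```

One open question with this approach is what to do if the snippet
itself contains a closing </syntaxhighlight> tag; the sketch does not
handle that case.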

Here are some problems I might face:
- If I update the working tree each time a file from the same
repository is added, then the line numbers may not match the old file
- Should I be periodically updating the repositories, or perhaps
keeping multiple snapshots of the same repository?
- Cloning an entire repository and keeping only one file does not seem
ideal, but I've yet to find a better solution. The more repositories
being used concurrently, the bigger an issue this might be
- I'm also worried about security implications of my program. Security
isn't my area of expertise, and I would definitely appreciate some
input from people with a security background
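
As a starting point for the security discussion, here is a sketch of
the kind of input checks that could run before anything is handed to
git. It is Python for illustration, the function names and the list of
allowed URL schemes are assumptions, and it is certainly not a complete
security review.

```python
from urllib.parse import urlparse

# Assumed allow-list; a real deployment might be stricter.
ALLOWED_SCHEMES = {"https", "http", "git"}


def is_safe_url(url):
    """Accept only URLs with an allowed scheme and a host part."""
    parsed = urlparse(url)
    return parsed.scheme in ALLOWED_SCHEMES and bool(parsed.netloc)


def is_safe_filename(filename):
    """Reject absolute paths, option-like names, and '..' path traversal."""
    if not filename or filename.startswith(("/", "-")):
        return False
    if "\\" in filename or "\0" in filename:
        return False
    return all(part not in ("", ".", "..") for part in filename.split("/"))
```

With these checks, a url such as file:///etc/passwd or a filename such
as ../../LocalSettings.php would be rejected before any git command is
run.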

Thanks for taking the time to read this and thanks in advance for any
feedback, bug reports, etc.

Have a great day,
Teresa
http://www.mediawiki.org/wiki/User:Chot

[1] https://www.mediawiki.org/wiki/Outreach_Program_for_Women
[2] http://www.mediawiki.org/wiki/Extension:Git2Pages
[3] https://gerrit.wikimedia.org/r/#/q/project:mediawiki/extensions/Git2Pages,n,z
