$ thought | blog

Place to serialize my thoughts…

Archive for the ‘Articles’ Category

Rewards of “Frequent Check In”

with 4 comments

I have worked with many developers, and I keep running into the same conversation. Whenever we are in the middle of development and I ask someone to check in their code, I hear one of the following replies -

  • Umm, let's check in towards the end of the day. We are not done with the code yet.
  • Let me clean up these things and make them perfect.
  • I don't want to check in, as I have some ongoing changes and I can't check in partial files.
  • Let's check in tomorrow, when we complete the major functionality.

I am sure you have heard a similar conversation (or had one). Take a moment and think about why we are afraid of checking in. I have been thinking about this for a while, and there are several reasons that make sense. A few of them are -

Fear of bad code - Developers are afraid of peers criticizing their code. That criticism is GOOD! Believe me, you are never going to write perfect code without receiving feedback on it. The sooner you receive feedback, the faster you improve. This fear is why people don't make their changes public until they are convinced they can't do any better. It is clearly a false fear.
One of the ways to improve, and to influence your team, is to make your work public while it is in progress. This makes the team aware of your approach. If there is any conflict in approach or a disconnect in understanding, it becomes visible sooner and you can resolve it more easily.

Not sure about the impact on other parts of the system - Yes! This is a very valid reason. You are in the middle of a feature and you are not sure how it will affect the rest of the system if you check in early. Half-baked features can take the whole system down. To get over this fear, unit test your code, run a full build before check-in, and use continuous integration so that breakage is caught early. Make sure you are providing a working build to the team by writing solid unit tests and more comprehensive integration tests.
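
One way to make the "test before check-in" habit mechanical is a small gate script. What follows is only a minimal sketch under my own assumptions: `checkin_if_green` is a hypothetical helper name, and the test and commit commands are placeholders for whatever your project really uses.

```python
import subprocess
import sys


def checkin_if_green(test_cmd, commit_cmd):
    """Run the test command; run the commit command only if the tests pass.

    Both arguments are argument lists supplied by the caller (for example
    the project's own test runner and VCS commit command). They are
    assumptions of this sketch, not something it prescribes.
    Returns True only when the commit command ran and succeeded.
    """
    tests = subprocess.run(test_cmd)
    if tests.returncode != 0:
        print("Tests failed; not checking in.")
        return False
    commit = subprocess.run(commit_cmd)
    return commit.returncode == 0


if __name__ == "__main__":
    # Stand-in commands so the sketch runs anywhere: a "test suite" that
    # passes and a "commit" that only prints a message.
    passing_tests = [sys.executable, "-c", "raise SystemExit(0)"]
    fake_commit = [sys.executable, "-c", "print('committed')"]
    checkin_if_green(passing_tests, fake_commit)
```

The point of the design is simply that the commit step is unreachable when the tests fail, so a red build never leaves your machine.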

Working on multiple parallel streams - Working on multiple parallel streams of work is usually a bad practice. It is even worse when you don't commit any of those changes to version control (VCS).
If you are working with a DVCS like Git, it is very easy to branch and commit code. If you are working with a centralized VCS like Subversion, it is a little trickier to keep parallel streams of work going with minimal effort. There are a few simple things you can do to make your life a little easier -

Use a DVCS. It will prove immensely valuable in the long run if you use it properly. There is a learning curve, but it is definitely worth it.
If you are using SVN, create a branch for a big change. For a small change, evaluate whether you can check in without breaking the application. Saving your work as a patch is another option. Keep the unit testing safety net on your side and make sure you progress smoothly.

In any case, it does not help when you are working on parallel streams of work and those changes co-exist. I have often seen people check in partial changes or leave out files. This happens when they are confused and trying to work out which files are needed to make one of the streams live.

Why is frequent check-in important?

  • You get feedback from your application and your team.
  • You never lose your changes and hard work. Machines do crash, and when they do, it's bad!
  • You avoid stepping on each other's toes, particularly when the team is distributed and code churn is high.
  • The whole team becomes aware of your approach, which removes silos. This helps bring everybody onto the same page and triggers conversations whenever required.
  • You can't explain in 30 minutes of speech what code explains in 10 minutes. It eliminates confusion.
  • The discipline of frequent check-ins lets you confirm that your changes are not causing any conflicts.
  • Smaller changes encourage you to refactor and improve code quality.

If you want to read more -

Golden principle of development: Check-in-early, check-in-often.
DVCS introduction from the master: Google Video on Git.
Continuous integration: Wikipedia entry.

Written by Sachin

June 19th, 2010 at 7:53 am

Things to Ponder with Alfresco (Part III: WCM – Improvements)

without comments

Nowadays, using a web content management system (WCM) is commonplace. You should continuously update website content to keep it fresh; this brings customers back and increases popularity. Almost every major website uses a CMS to achieve this. It might be developed in-house or be an off-the-shelf product. In either case, it helps organizations separate the content authoring team from the application development cycle, which allows quicker content updates that do not depend on release schedules. I am going to point out certain things about the WCM feature of Alfresco. You can apply the same principles to your own WCM, whether you are using an existing one or planning to develop a new one!

Content storage format - Alfresco provides a nice feature called "web forms". There are currently two types of forms: WCM forms and ECM forms. Both capture the input data and store it as XML. You can also choose to apply a content transformation using XSL, which saves the final output as HTML, text, PDF and so on. Keep in mind, though, that the content itself is captured and saved as XML. In my view, one of the strengths of Alfresco is its content repository (CR). If XML is the target format, it is saved in the CR as a cm:content property, which defeats the whole purpose. If the CR is not aware of the structure of the content, what is the point of defining a content model? It is not very different from storing the content XML as a plain string in a database. Frankly, I don't see any benefit in using a CR over a database with this method.

To avoid such inefficient use of Alfresco, you can put in some effort and write a custom form component. Use custom forms (develop the forms in your application and communicate with Alfresco through its API) to capture user data and store it in the content repository according to a defined content model and schema.

This will help you in the following ways -

  • The content repository now understands the structure of your content. This is really important; if you have any doubts, please post them in the comments. I might write about it later on my blog.
  • You can apply your own content transformation logic so that content can be exported as XML, HTML or a custom output format. Generate this output once, each time you update the content. This makes the content management system more efficient at delivering the target content.
  • You have the flexibility of your own input format: use JSP or Rails forms, it's your choice!
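
To make the difference between "one opaque XML blob" and "structured properties" concrete, here is a rough sketch. It is not Alfresco code: the `my:` prefix, the flat one-level form layout and the function name `form_xml_to_properties` are all assumptions I made up for illustration. It only shows the mapping step a custom form component would perform before handing named properties to the repository.

```python
import xml.etree.ElementTree as ET


def form_xml_to_properties(xml_text, prefix="my"):
    """Flatten the direct children of a form's root element into properties.

    A child element <title>Hello</title> becomes {"my:title": "Hello"}.
    The prefix and the one-level layout are assumptions of this sketch.
    """
    root = ET.fromstring(xml_text)
    properties = {}
    for child in root:
        value = (child.text or "").strip()
        properties[f"{prefix}:{child.tag}"] = value
    return properties


# Hypothetical form output, as a web form might have captured it.
sample = """<article>
  <title>Release notes</title>
  <author>A. Writer</author>
  <body>Version 2 is out.</body>
</article>"""

props = form_xml_to_properties(sample)
print(props)
```

Once the data is in this shape, each value can be stored against a property defined in the content model instead of living inside a single string the repository cannot see into.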

Deployment benefits - Alfresco has another great strength: its deployment infrastructure. I think it is also one of its weaknesses. Use it wisely to improve your release management. On the release date of a web application, you usually run into content sanity issues or format differences. To avoid that, set up proper environments where Alfresco can sync content, and test it there before you push it live. This might save you a lot of frustration.

Although it is a great feature, there is a caveat. Generally, all the code for the web application is stored in a version control system such as CVS, while Alfresco code (web scripts, web form definitions etc.) is stored in Alfresco. Keeping the versions of Alfresco code and web application code in step is a nightmare. Develop an automated web script deployment tool and, every time you release, wipe out all the previous web scripts and deploy new ones from version control. This forces you to keep the up-to-date code in version control, and versioning becomes much more robust.
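
The "wipe and redeploy" step can be sketched in a few lines. This is a simplified illustration, assuming the web scripts live in a directory on disk that the deployment step fully owns; `redeploy_webscripts` and the file names are hypothetical, and a real tool would also need to push the files into Alfresco and refresh them there.

```python
import shutil
import tempfile
from pathlib import Path


def redeploy_webscripts(checkout_dir, deploy_dir):
    """Replace everything in deploy_dir with a fresh copy of checkout_dir.

    Returns the deployed file paths, relative to deploy_dir, sorted.
    """
    checkout = Path(checkout_dir)
    deploy = Path(deploy_dir)
    if deploy.exists():
        shutil.rmtree(deploy)          # wipe previously deployed scripts
    shutil.copytree(checkout, deploy)  # deploy exactly what version control holds
    return sorted(
        p.relative_to(deploy).as_posix() for p in deploy.rglob("*") if p.is_file()
    )


# Demo in a throwaway directory: one script in the checkout, one stale
# script in the deployment target that is no longer in version control.
with tempfile.TemporaryDirectory() as tmp:
    checkout = Path(tmp) / "checkout"
    deployed = Path(tmp) / "deployed"
    checkout.mkdir()
    (checkout / "hello.get.desc.xml").write_text("<webscript/>")
    deployed.mkdir()
    (deployed / "stale.get.js").write_text("// old script")
    print(redeploy_webscripts(checkout, deployed))
```

Because the target is wiped first, anything that was edited only on the server disappears at the next release, which is exactly the pressure that keeps version control up to date.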

Written by Sachin

January 4th, 2010 at 8:48 pm

Posted in Articles,Developer


Things to ponder with Alfresco (Part-I: Content Model)

without comments

For any content repository, the content model definition is similar to the DDL (schema definition) for a database. In a big application, the content model is expected to evolve, just as a database schema does. Updating table structures or introducing new tables is common in any business application. In a content repository, too, you should keep in mind that your content model must be open to evolution. Define a content model that is extensible. Most good practices of database design apply here.

A few tips for better content models -

  • Use aspects to specialize types instead of defining new content types. One advantage is that aspects can be applied or removed at run time, which gives you much more control over the dynamic nature of types.
  • Aspects also help keep the default set of properties on a content type small. Alfresco recommends that you prefer aspects over custom content types.
  • Aspects help you group properties logically, so your data can be part of multiple groups at the same time. If you have the aspects Publishable and Indexable, you can apply or revoke either one at will without much hassle.
  • Indexing is very helpful for searchable content, but use it wisely. The cost of indexing in Alfresco is much higher than the cost of an index in a database, as these are full-text search indexes. They take up more disk space as well as time to index the data. Don't index content that people are not expected to search! You can specify this when you define the content model.
  • Versioning is similar to indexing: use it wisely. Versionable content is handled separately by Alfresco, which may affect performance as your repository grows. Don't mark content as versionable if you don't plan to maintain revisions of it.
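
The idea of aspects as removable groups of properties can be illustrated with a toy model. This is not the Alfresco API (there, aspects are declared in the content model and applied through the repository services); the `Node` class and its method names below are hypothetical and exist only to show the behaviour described in the tips above.

```python
class Node:
    """A content node with a fixed type plus aspects applied at run time."""

    def __init__(self, content_type, **properties):
        self.content_type = content_type
        self.properties = dict(properties)
        self.aspects = {}  # aspect name -> names of the properties it brought

    def add_aspect(self, name, **aspect_properties):
        # Applying an aspect adds its group of properties to the node.
        self.aspects[name] = set(aspect_properties)
        self.properties.update(aspect_properties)

    def remove_aspect(self, name):
        # Revoking an aspect removes only the properties it contributed.
        for prop in self.aspects.pop(name, set()):
            self.properties.pop(prop, None)

    def has_aspect(self, name):
        return name in self.aspects


doc = Node("article", title="Release notes")
doc.add_aspect("Publishable", publish_date="2008-09-13")
doc.add_aspect("Indexable", index_body=True)
doc.remove_aspect("Publishable")
print(sorted(doc.aspects), doc.properties)
```

Note that the node's type never changes; only the optional groups around it do, which is why the default property set on the type can stay small.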

I will continue this series with other experiences with Alfresco. Later.

Written by Sachin

September 13th, 2008 at 11:51 pm

Linux Myth Buster: What is the difference between Server and Desktop Linux?

with 2 comments

Disclaimer: This article is too long! It is long because it covers a lot of historical background that led to this question. If you are not interested, you can skip to the ‘Actual answer begins here…’ section.

Why is this question here?

Linux has been part of my technical life since 2001. I have seen the early Red Hat Linux 6.x releases and the fall of Red Hat Linux, the rise of Fedora and the birth of Ubuntu. The whole journey has been interesting and enjoyable in its own way. Today Linux holds a position that is not dominant but is surely challenging! Linux does not aim to replace every desktop in the world (… some might say it just can’t!), but it surely puts a smile on the faces of people who have used it for a while. You can’t just put Linux out of the door and wait for it to disappear; the tiny Tux now has a great foot-tapping parade of people and organizations that will do whatever is needed to keep it growing!

Friends and colleagues who are new to Linux, or who just want to learn it, have asked me one question many times. They start with, “Sachin, I have a stupid question…” This is one question I have always answered very carefully. The reason I never gave a careless answer is that they were new to Linux, and a slightly wrong answer might lay a wrong foundation in their minds. My friend Ashish suggested that I write a blog post about it so that people can get the right information.

The question is not stupid at all! In fact, most people don't know the answer and hesitate to ask. I would request you to read the whole post, and if you have any suggestion or question, feel free to post a comment.

In technical terms, Linux is simply the kernel: a ~2 MB binary file that has all (… almost all) the stuff necessary to manage file systems, networks, processes and so on. Mandrake, SuSE, Fedora, RHEL and all the other popular Linux flavors contain this ~2 MB binary as the heart and soul of their distribution. Remember the word ‘distribution’ from the last sentence; we are going to come back to it shortly. You can obtain Linux in its pure form from the kernel distribution site. That is the purest and rawest form of Linux. If you follow all the instructions and put in a lot of effort (although the target audience of this post is not expected to do that), you might get a text-based console at most!

Remember the word I asked you to hold on to? Distribution: yes, that is a very important term in this world! Since Linux started way back in the 90s and was free, people started experimenting with it. People wrote tons of applications that could run on this robust, still-young platform. Most of these applications came from university students, startups, research laboratories and programmers dying to try new things! This created a huge, unmanageable sea of application, system and utility software. Clearly, an effort to make Linux easy to adopt was necessary. The solution to this problem was the distribution. The idea was great because of its simplicity. People started delivering Linux along with a few must-have programs, nicely wrapped with a GUI and an installer. This covered system, utility, programming, internet and other software categories. A beginner got everything a beginner needed in one neat package.

Soon, distribution sizes exploded from a few hundred megabytes to 4-5 gigabytes, and they kept growing day by day. Tons of software shipped in the default bundle that a normal user did not need at all. To address this chaos and target Linux at specific audiences, people started creating their own customized distributions! This led to Live CDs, business distributions, and small and large distributions. Each distribution targeted an audience and considered its software and application needs. Nothing fundamental changed: each still offered a nice GUI and rock-solid base functionality. Each distribution (a.k.a. distro) addressed a very specific problem. At the time of writing this post there are 1000+ distributions available (refer to DistroWatch).

Actual answer begins here…

Linux was really easy to configure, and creating a distribution was real fun, which led to the “Problem of Many”. There were unlimited options and unlimited configurations! This created a great deal of chaos and made it impossible for newcomers to figure out which of these 1000+ distributions to use day to day, or which to use for a Linux server. When an organization needed to set up a Linux-powered web server, a ‘Super’ developer would spend around a week creating a heavily customized version of Linux, based on whichever out-of-the-box distribution came closest to their criteria. If that didn't work out well, the organization would pay a hefty amount of money to purchase Enterprise Linux from vendors like Red Hat or SuSE. Commercial vendors offered 24x7 support and other business terms that were comforting to organizations new to Linux. Sooner or later, this was bound to happen. (Let’s leave the topic here, as it might make a good discussion on Slashdot.)

In simple terms, Linux is a single binary. Distributions are nothing but customized suites of application and system software that solve some business or personal computing problem, in most cases in an ‘out-of-the-box’ fashion. Server and Desktop versions of a distribution are nothing but different sets of software packaged together. Yes, it is that simple! One distribution might be called Server Linux, and others might go by different names.

Ubuntu and Kubuntu are today’s most popular Desktop Linux distributions (from my perspective). Ubuntu also releases a server distribution with the same quality of bundling and support.

What can I expect from a Desktop distribution?

From a Linux Desktop distribution you can expect all the functions (perhaps many more) of an installed and configured standard Windows box. This includes an e-mail client, an Internet browser (Mozilla Firefox etc.), a media player, a media browser, basic games, network support, an office and productivity suite and so on. There are tons of other features that you just need to explore.

What can I expect from a Server distribution?

If you are planning to host a web server that runs database applications written in PHP or Ruby, then this is perhaps the distribution you are looking for. It includes Telnet, SSH, FTP, web and other remote-access servers, plus a few server applications that web administrators normally need. In most cases you don’t need this distribution, as you won't use most of these applications day to day unless you are a geek.

I am still confused, which one is right for me today? Will I miss something if I choose a wrong distribution today?

Hell No! There is no such phrase as ‘can’t happen’ in the Linux world! Whether you choose a desktop, a server or, for that matter, any other distribution, you will be able to install any application that can run on Linux. Let us take the standard desktop distribution of Ubuntu as an example. If you installed the standard 1-CD installation and then want to host a web server, please go ahead. You can always find very nice and friendly tools like ‘Apt’ and the graphical ‘Synaptic Package Manager’, which will help you install web servers, database servers, FTP or SSH servers; with the graphical tool you don't even type a single command! You can just click the mouse, make sure you are connected to the internet and choose any software from the repository of ~25000 applications available for Ubuntu! Doesn’t that sound simple and great? So there is no problem in beginning with any distribution of your choice. A few distributions may not support ‘Apt’, but they have some parallel application that does the same job. SuSE has ‘YaST’, Fedora and Red Hat have ‘yum’ (older Red Hat releases used ‘up2date’), and others have similar tools. You just need to find the tool that runs on your distribution.

Thank you for reading such a lengthy post. Please let me know if anything needs correcting.

Written by Sachin

August 21st, 2007 at 9:06 pm

Posted in Articles


Java: Not so stupid series

with 3 comments

I always read the "Not so stupid questions" series on Java Today. It was becoming difficult for me to keep track of the list of questions. Sometimes you need the list and can't find it; don't you hate that? The first thing that came to mind was my blog! After all, we are supposed to use these tools to organize our personal information, aren't we? Blogs were invented for that, so I decided to make the list a post and let others find the entire list in one place.

I googled for the stupid questions list but couldn't find a single page listing all of them. :( That problem is gone now. You may want to bookmark this page for reference ;)

(Not So) Stupid Questions series:

Q 01: Should I try to declare more of my methods to be static?

Q 02: Some side-effects of String equality don't make sense

Q 03: Some uses of the private keyword don't make sense.

Q 04: I have no idea when to create a new package and what should go in it.

Q 05: When should I implement an interface, over inheriting from a parent class?

Q 06: How can you justify Dimension java.awt.Component.getMinimumSize() when Dimension does not implement Comparable?

Q 07: There are some weird Java operators I don't understand.

Q 08: What's the deal with serialVersionUID?

Q 09: I'm attending my first JavaOne. What should I plan on?

Q 10: Other than bundling my classes, what good does a JAR do me?

Q 11: I have a question about a Java feature. Who do I ask?

Q 12: Can I use the 'Java' name in an open-source project?

Q 13: Why do constructors have to start with a call to super()?

Q 14: Why is if (true); considered valid Java syntax?

Q 15: How can a constructor be private?

At the time of writing this post, only 15 questions were available. If any additional questions appear, I will update the post.

Written by Sachin

December 23rd, 2006 at 2:58 pm

Posted in Articles,Java
