Rule Definition
Efficient XML parsing becomes more critical as XML is adopted more widely, especially in applications that handle large volumes of data. Improper parsing can cause excessive memory usage and long processing times, both of which hurt scalability.
DOM parsing reads the entire XML document in one pass and loads it into memory as a tree.
It is therefore expensive in both time and memory, and should be used only when the XML document has to be modified.
Remediation
Use SAX when no modifications are made to the document, particularly when parsing happens in multiple threads (session, EJB...) or inside a loop.
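As a rough illustration of the read-only case, the sketch below counts elements with SAX without ever building a document tree. The class name `SaxElementCounter` and its `countElements` method are hypothetical and not part of the rule's samples; the sketch assumes the default, non-namespace-aware factory, so element names are matched on `qName`.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only.
class SaxElementCounter {

    // Counts the elements named elementName in the given XML text.
    // The document is streamed through the handler; no tree is kept in memory.
    static int countElements(String xml, final String elementName) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if (qName.equals(elementName)) {
                    count[0]++;
                }
            }
        };
        try {
            // A parser is created per call, so nothing is shared between threads.
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return count[0];
    }
}
```

For example, `SaxElementCounter.countElements("<orders><order/><order/><note/></orders>", "order")` returns 2. Because the handler only keeps a counter, memory use does not grow with the size of the document.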
Violation Code Sample
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;
public class DomParserExample {
    Document dom;
    private void parseXmlFile() {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        try {
            DocumentBuilder db = dbf.newDocumentBuilder(); // VIOLATION
            dom = db.parse("sample.xml");
        } catch (ParserConfigurationException pce) {
            pce.printStackTrace();
        } catch (SAXException se) {
            se.printStackTrace();
        } catch (IOException ioe) {
            ioe.printStackTrace();
        }
    }
}
Fixed Code Sample
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class SAXParserExample extends DefaultHandler {
    [...]
    public SAXParserExample() {
        [...]
    }
    private void parseDocument() {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        try {
            SAXParser sp = spf.newSAXParser();
            sp.parse("sample.xml", this);
        } catch (SAXException se) {
            se.printStackTrace();
        } catch (ParserConfigurationException pce) {
            pce.printStackTrace();
        } catch (IOException ie) {
            ie.printStackTrace();
        }
    }
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        [...]
    }
    public void characters(char[] ch, int start, int length) throws SAXException {
        [...]
    }
    public void endElement(String uri, String localName, String qName) throws SAXException {
        [...]
    }
    [...]
}
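The elided handler bodies above depend on the application. Purely as a sketch of how the three callbacks can cooperate, the hypothetical `SaxTextCollector` below gathers the text of every element with a given name. It is not the content of the `[...]` placeholders; it assumes the target elements are not nested inside each other and that the parser is not namespace-aware. Text is buffered because a parser may deliver the content of one element through several `characters` calls.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler for illustration only.
class SaxTextCollector extends DefaultHandler {
    private final String targetElement;
    private final List<String> values = new ArrayList<>();
    private final StringBuilder buffer = new StringBuilder();
    private boolean inTarget;

    SaxTextCollector(String targetElement) {
        this.targetElement = targetElement;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if (qName.equals(targetElement)) {
            inTarget = true;
            buffer.setLength(0); // start a fresh value
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inTarget) {
            buffer.append(ch, start, length); // may be called more than once per element
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals(targetElement)) {
            inTarget = false;
            values.add(buffer.toString());
        }
    }

    // Parses the XML text and returns the text of each target element, in document order.
    static List<String> collect(String xml, String targetElement) {
        SaxTextCollector handler = new SaxTextCollector(targetElement);
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.values;
    }
}
```

Only the values of interest are retained, so the memory footprint follows the amount of extracted data rather than the size of the whole document.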
Reference
http://ptgmedia.pearsoncmg.com/images/0131453491/samplechapter/megginson_ch08.pdf
http://www.extreme.indiana.edu/~aslom/exxp/
Related Technologies
JEE
Technical Criterion
Efficiency - Memory, Network and Disk Space Management