Get One Random SQL Row From A Very Huge Table
Posted in java on August 1st, 2010 by Aditya – Be the first to commentI have a very huge SQL table (5m entries and keep increasing) and want to select one row of entry at random. Typical method of using one SQL statement is apparently so slow. The typical method is by using the below statement by the way.
SELECT <column_name/*> FROM <table_name> ORDER BY RAND() LIMIT 1
while the above statement works fine on smaller table – generally 10000 records and below (depending on your processor speed), when its confronted with huge table, the statement just could not make it. Just to give you some imagination, I start using one of those statement and while waiting for the result, I can go to vending machine, grab a coke and come back, finish half a bottle of coke before stopping the process out of frustration.
After much tweaking and such, I realized that some statement are more size-conscious than the other. The trick is that using several statements which are less size-conscious to gather much needed parameter to assist statement that are more size conscious by increasing its … specifity (TODO: rephrase this paragraph)
I implement these principles using Java and the result can be seen below
public synchronized PhotoBean getRandomPhoto() throws DAOException {
Connection con = null;
try {
con = getConnection();
PreparedStatement pstmt1 = con
.prepareStatement("SELECT max(`id`) AS max_id FROM photos;");
ResultSet rs1 = pstmt1.executeQuery();
//this to to null-proof the logic
int max_id = 22101985;
int randomId = 22; // put some real ID just to make sure
while (rs1.next()) {
max_id = rs1.getInt("max_id");
}
Random randomGenerator = new Random();
randomId = randomGenerator.nextInt(max_id);
PreparedStatement pstmt2 = con
.prepareStatement("SELECT * FROM photos WHERE `id` >= ? ORDER BY `id` ASC LIMIT 1");
pstmt2.setInt(1, randomId);
ResultSet rs = pstmt2.executeQuery();
PhotoBean tmpPhoto = new PhotoBean();
while (rs.next()) {
tmpPhoto.setPhotoId(rs.getString("photo_id"));
//... more
}
//ps: you can use just one pstmt and it will be no problem as its executed separately but i am just too lazy
pstmt1.close();
pstmt2.close();
releaseConnection(con);
return tmpPhoto;
} catch (Exception e) {
//... more
}