1.2. Fruit detection and yield estimation

Participants: E. Gregorio, J.R. Rosell-Polo, J. Arnó, J. Gené, A. Escolà, R. Sanz, J.C. Miranda

Fruit detection and characterization 

Practical and reliable methodologies for in-field fruit detection are essential to obtain accurate yield predictions, to make progress in robotic harvesting and to optimize thinning operations, among many other applications. Despite advances in areas such as robotics and computer vision, fruit detection remains a challenge: fruits are often occluded by other vegetative organs, and detection must work under varying lighting conditions. To minimize these limitations, our group is developing new 3D fruit detection and localization methodologies that integrate sensors with artificial vision and/or artificial intelligence algorithms. The detection methodologies used include, among others, 3D LiDAR sensors, depth cameras (RGB-D) and photogrammetric techniques such as structure-from-motion (SfM).

LiDAR 3D sensors are based on the emission of multiple laser beams that simultaneously rotate at high frequency, thus generating a 3D point cloud of the scanned scene (Figure 1a). The intensity of the reflected light makes it possible to differentiate the fruits from other vegetative organs (Figure 1b). In addition, this methodology is not affected by lighting conditions and has the advantage of directly providing the 3D location of the fruit (Gené-Mola et al., 2019b).




Figure 1. (a) LiDAR 3D System (+ GNSS receiver) scanning an apple tree. Source: Gené-Mola et al. (2019a). (b) From left to right: RGB image of the apple tree; LiDAR point cloud; apples detected from the intensity of the reflected signal. Source: Gené-Mola et al. (2019b).
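The intensity-based separation described above can be illustrated with a simple threshold on the per-point return intensity. This is only a minimal sketch, not the procedure of Gené-Mola et al. (2019b): the point layout `(x, y, z, intensity)`, the function name `filter_by_intensity`, the sample values and the threshold are all assumptions made for the example, including the assumption that fruit points return a higher intensity than foliage.

```python
from typing import List, Tuple

# Hypothetical point layout: (x, y, z, intensity), intensity normalized to 0..1.
Point = Tuple[float, float, float, float]


def filter_by_intensity(points: List[Point], threshold: float) -> List[Point]:
    """Keep the points whose return intensity is at or above `threshold`."""
    return [p for p in points if p[3] >= threshold]


# Invented sample cloud: two bright returns (fruit candidates), two dim ones.
cloud = [
    (0.10, 1.20, 1.50, 0.82),
    (0.12, 1.21, 1.52, 0.79),
    (0.40, 1.00, 1.10, 0.35),
    (0.55, 0.90, 1.80, 0.30),
]

candidates = filter_by_intensity(cloud, threshold=0.6)
print(len(candidates))
```

In a real pipeline the retained points would still have to be grouped into individual fruits; that clustering step is outside the scope of this sketch.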

Another strategy to minimize the number of occluded fruits has been to combine LiDAR 3D sensors with the application of forced air flow. As shown in Figure 2, combining the fruits detected in both scenarios (with and without air) increases the percentage of fruits detected (Gené-Mola et al., 2019a).

Figure 2. The right and left images show, respectively, tests without and with forced air flow. The effect of the air differs from fruit to fruit, causing either the disocclusion (red box) or the occlusion (blue box) of the fruit. Source: Gené-Mola et al. (2019a).
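Combining the detections from the two scans amounts to taking their union while avoiding counting the same fruit twice. The sketch below is one possible way to do that, assuming each detection is reduced to a 3D centre and that two centres closer than a chosen `min_separation` belong to the same fruit. The function name, the coordinates and the 5 cm value are hypothetical and are not taken from the cited work.

```python
import math
from typing import List, Tuple

Centre = Tuple[float, float, float]  # fruit centre in metres (assumed format)


def merge_detections(scan_a: List[Centre], scan_b: List[Centre],
                     min_separation: float) -> List[Centre]:
    """Union of two detection lists; a centre from `scan_b` is added only if
    it lies at least `min_separation` away from every centre already kept."""
    merged = list(scan_a)
    for candidate in scan_b:
        if all(math.dist(candidate, kept) >= min_separation for kept in merged):
            merged.append(candidate)
    return merged


without_air = [(0.00, 0.00, 1.50), (1.00, 0.00, 1.70)]
with_air = [(0.02, 0.00, 1.50), (2.00, 0.00, 1.40)]  # first one repeats a fruit

all_fruits = merge_detections(without_air, with_air, min_separation=0.05)
print(len(all_fruits))
```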

RGB-D sensors are devices that simultaneously provide color, depth, and infrared signal intensity data. The information provided by the three channels has been used to train deep neural networks (deep learning) (Gené-Mola et al., 2019c), achieving significantly better detection results than those obtained with conventional approaches based on RGB images alone (Figure 3).

Figure 3. Fruit detection results corresponding to color images (RGBp), infrared signal (S), depth (D) and combination of the above (RGBp+S+D). It is observed that by combining the information of the three channels, the number of true positives is higher, while the false positives and false negatives are substantially reduced. Source: Gené-Mola et al. (2019c).
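True positives, false positives and false negatives are usually summarized as precision, recall and F1-score. The helper below shows the standard definitions; the function name and the counts in the usage line are illustrative and are not results from the cited study.

```python
from typing import Tuple


def detection_metrics(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from detection counts.

    precision = TP / (TP + FP), recall = TP / (TP + FN),
    F1 = harmonic mean of precision and recall.
    A ratio with a zero denominator is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Invented counts, for illustration only.
precision, recall, f1 = detection_metrics(tp=90, fp=10, fn=10)
print(precision, recall, f1)
```

Fewer false positives raise precision and fewer false negatives raise recall, which is why a combination of channels that reduces both improves the F1-score.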

Structure-from-motion is a photogrammetric technique that reconstructs high-resolution 3D point clouds from photographs taken from different positions (Figure 5). Combining these models with deep neural networks (deep learning) achieves high detection rates (> 90%) with less than 4% false positives (Gené-Mola et al., 2020b). Figure 4 shows an example of fruit detection in images using deep neural networks. Figure 5 shows the results obtained after applying photogrammetry techniques to locate the image detections in 3D space. The following link shows an interactive 3D view of these results.

Figure 4. Fruit detection and segmentation results in RGB images using the Mask-RCNN deep neural network. Source: Gené-Mola et al. (2020a).


Figure 5. (top) 3D point cloud generated from original RGB images. (bottom) Detected fruits by applying a neural network on RGB images and projecting the detections into the 3D cloud. 3D display of results: here. Source: Gené-Mola et al. (2020b).

Our research group is currently developing new techniques for the in-field characterization of fruits, and particularly for fruit size estimation. In addition to being a first-rate quality parameter, fruit size is key information for producers when optimizing decision-making and making accurate yield predictions. To this end, our group has developed two methodologies: one for measuring fruits in 3D point clouds (Figure 6), and another for detecting and measuring fruits using neural networks in RGB-D images (Figure 7).

Figure 6. Apple fruit size estimation in 3D point clouds generated with structure-from-motion techniques. Source: Gené-Mola et al. (2021).

Figure 7. Apple fruit detection and size estimation using neural networks in RGB-D images. Source: Ferrer-Ferrer et al. (2022).
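As a rough illustration of how a fruit can be measured in a 3D point cloud, the sketch below takes the points already assigned to one fruit and returns the longest distance between any two of them as a diameter estimate. This is a deliberately simple option chosen for the example, and it is an assumption that it resembles any of the methods in the cited works; the function name and the sample points are hypothetical.

```python
import math
from itertools import combinations
from typing import Sequence, Tuple

Point3D = Tuple[float, float, float]  # coordinates in metres (assumed)


def largest_segment_diameter(points: Sequence[Point3D]) -> float:
    """Diameter estimate: longest distance between any two fruit points."""
    if len(points) < 2:
        raise ValueError("at least two points are needed")
    return max(math.dist(p, q) for p, q in combinations(points, 2))


# Invented points lying on a sphere of radius 0.04 m centred at the origin.
apple_points = [
    (0.04, 0.0, 0.0), (-0.04, 0.0, 0.0),
    (0.0, 0.04, 0.0), (0.0, -0.04, 0.0),
    (0.0, 0.0, 0.04),
]

diameter_m = largest_segment_diameter(apple_points)
print(round(diameter_m * 1000), "mm")
```

The pairwise search grows quadratically with the number of points, and the estimate shrinks when occlusion hides one side of the fruit, so this approach is only a starting point.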


Fruit yield estimation and advanced sampling techniques

Estimating the potential crop yield at the plot level currently requires sampling methods that provide unbiased and accurate estimates. In practice, the methods applied must ensure that the sampling error (i.e., the difference between the actual value and the estimated value, divided by the actual value) does not exceed 10%, and must do so while sampling only a small number of trees within the plot. The fruit count (or fruit load) is the parameter normally measured when sampling in the field, and it is ultimately converted into an estimate of weight or productivity (kg/ha) that depends on the species, variety and expected commercial caliber.
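The sampling error defined above can be computed directly once the actual value is known, for instance after harvest. In this sketch the error is taken in absolute value, and the plot total is estimated as the mean fruit count per sampled tree multiplied by the number of trees. The function names and every number are invented for illustration.

```python
from statistics import mean
from typing import Sequence


def estimate_plot_fruit_load(sampled_counts: Sequence[float], n_trees: int) -> float:
    """Estimated fruits in the plot: mean count per sampled tree x tree number."""
    return mean(sampled_counts) * n_trees


def relative_sampling_error(actual: float, estimated: float) -> float:
    """|actual - estimated| / actual, following the definition in the text."""
    if actual == 0:
        raise ValueError("actual value must be non-zero")
    return abs(actual - estimated) / actual


# Invented example: 5 sampled trees in a plot of 400 trees.
counts = [110, 95, 120, 105, 100]
estimated_total = estimate_plot_fruit_load(counts, n_trees=400)
error = relative_sampling_error(actual=40000, estimated=estimated_total)
print(f"estimated fruits: {estimated_total:.0f}, sampling error: {error:.1%}")
```

With these made-up figures the error stays under the 10% limit mentioned in the text.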

Random sampling is a well-known method. However, a certain bias cannot be avoided, and the representativeness of the selected trees during the sampling process is therefore questionable. On the other hand, the achievement of accurate predictions (with low sampling error) usually requires the selection of a large number of trees, and even more so if the plot has significant spatial variability in terms of harvest or fruit load per tree.

To optimize sampling in fruit plantations, the Research Group in AgroICT & Precision Agriculture (GRAP) has been testing two advanced sampling methods that make use of ancillary information provided by remote sensors, such as aerial images acquired from an airplane or a drone. The first method (stratified sampling) uses the Normalized Difference Vegetation Index (NDVI). Classifying the trees according to their NDVI delineates zones within the plot that differ in NDVI and in potential harvest. Trees are then sampled at random within each zone, which gives a more representative sample of the trees in the plot. Applying this methodology, Uribeetxebarria et al. (2019a) reduced the sample size by 17% while maintaining the same sampling accuracy as the random method described above. Figure 8 shows the effect of stratification when a tree vigour map (NDVI) is available.

Figure 8. Sampling schemes with a sample size of 12 trees or sampling points: (left) simple random sampling, (center) stratified sampling using two NDVI classes (6 trees per class), (right) stratified sampling using three NDVI classes (4 trees per class). Source: Uribeetxebarria et al. (2019a).
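The stratified scheme in Figure 8 can be sketched as follows: trees are ordered by NDVI, split into classes, and the same number of trees is drawn at random from each class. Using equal-count classes is an assumption made to keep the example short; the cited work may classify trees differently. The function name, the tree identifiers and the NDVI values are hypothetical.

```python
import random
from typing import Dict, Hashable, List, Optional


def stratified_sample(ndvi_by_tree: Dict[Hashable, float], n_classes: int,
                      per_class: int, seed: Optional[int] = None) -> List[Hashable]:
    """Draw `per_class` trees at random from each of `n_classes` NDVI classes.

    Classes are equal-count groups of trees ordered by increasing NDVI
    (an assumption of this sketch).
    """
    rng = random.Random(seed)
    ranked = sorted(ndvi_by_tree, key=ndvi_by_tree.get)
    bounds = [round(i * len(ranked) / n_classes) for i in range(n_classes + 1)]
    sample: List[Hashable] = []
    for lo, hi in zip(bounds, bounds[1:]):
        # random.sample raises ValueError if a class has fewer than per_class trees
        sample.extend(rng.sample(ranked[lo:hi], per_class))
    return sample


# Invented plot: 60 trees whose NDVI increases with the tree identifier.
ndvi = {tree_id: 0.30 + 0.005 * tree_id for tree_id in range(60)}

# 12 trees in total: 3 NDVI classes x 4 trees per class.
trees_to_count = stratified_sample(ndvi, n_classes=3, per_class=4, seed=1)
print(sorted(trees_to_count))
```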

The second method (ranked-set sampling) has been applied for the first time in fruit growing by our research group (Uribeetxebarria et al., 2019b). Like stratified sampling, this second method also uses information provided by remote sensors to facilitate the selection of those specific trees that reliably represent the entire plot. The ultimate goal is to develop a sampling method that, in addition to providing accurate yield estimates, allows small sample sizes to be used (Uribeetxebarria et al., 2019a). Thus, applying this second method, our research group has obtained satisfactory estimates of peach fruit load using sample sizes of only 5 trees per plot. The ancillary information finally recommended for better sampling efficiency (tree selection) has been the projected area of the crown. This tree size information is obtained with higher resolution from RGB cameras mounted on unmanned aerial vehicles (Uribeetxebarria et al., 2019b).
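Ranked-set sampling can be sketched with its classical balanced design, which is assumed here because the text does not detail the exact protocol used. To measure k trees, k random sets of k trees are drawn, each set is ranked using the inexpensive ancillary variable (for example, projected crown area), and the tree with rank i is taken from the i-th set. The function name and the data below are hypothetical.

```python
import random
from typing import Dict, Hashable, List, Optional


def ranked_set_sample(ancillary: Dict[Hashable, float], set_size: int,
                      seed: Optional[int] = None) -> List[Hashable]:
    """One cycle of balanced ranked-set sampling.

    Draws set_size * set_size distinct trees, splits them into `set_size`
    sets, ranks each set by the ancillary value and keeps the tree of
    rank i (counting from the smallest) from the i-th set.
    """
    rng = random.Random(seed)
    drawn = rng.sample(list(ancillary), set_size * set_size)
    selected: List[Hashable] = []
    for rank in range(set_size):
        group = drawn[rank * set_size:(rank + 1) * set_size]
        group.sort(key=ancillary.get)
        selected.append(group[rank])
    return selected


# Invented projected crown areas (m2) for 100 trees.
crown_area = {tree_id: 2.0 + 0.03 * tree_id for tree_id in range(100)}

# Only 5 trees need fruit counting; 25 are ranked from the aerial data alone.
trees_to_count = ranked_set_sample(crown_area, set_size=5, seed=0)
print(trees_to_count)
```

The design spreads the measured trees across the range of the ancillary variable, which is what allows a small sample to represent the plot when that variable is related to fruit load.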

Despite these satisfactory first results, the variability inherent in many plantations makes yield estimation a challenge. The issue that remains to be resolved is the possible combination of advanced sampling methods and automatic fruit detection and sizing techniques for a more accurate estimate of the yield, either in terms of fruit load or in terms of potential production in kg or tons per hectare.



Ferrer-Ferrer, M., Ruiz-Hidalgo, J., Gregorio, E., Vilaplana, V., Morros, J.R., Gené-Mola, J., 2022. Simultaneous fruit detection and size estimation using multitask deep neural networks. (Submitted)

Gené-Mola, J., Gregorio, E., Auat Cheein, F., Guevara, J., Llorens, J., Sanz-Cortiella, R., Escolà, A., Rosell-Polo, J.R., 2019a. Fruit detection, yield prediction and canopy geometric characterization using LiDAR with forced air flow. Comput. Electron. Agric. 168. DOI: 10.1016/j.compag.2019.105121

Gené-Mola, J., Gregorio, E., Guevara, J., Auat, F., Sanz-Cortiella, R., Escolà, A., Llorens, J., Morros, J.-R., Ruiz-Hidalgo, J., Vilaplana, V., Rosell-Polo, J.R., 2019b. Fruit detection in an apple orchard using a mobile terrestrial laser scanner. Biosyst. Eng. 187, 171–184. DOI: 10.1016/j.biosystemseng.2019.08.017

Gené-Mola, J., Gregorio, E., Rosell-Polo, J.R., 2020a. Cómo la inteligencia artificial nos ayuda a contar manzanas [WWW Document]. The Conversation. URL: theconversation.com/como-la-inteligencia-artificial-nos-ayuda-a-contar-manzanas-130571

Gené-Mola, J., Sanz-Cortiella, R., Rosell-Polo, J.R., Escolà, A., Gregorio, E., 2021. In-field apple size estimation using photogrammetry-derived 3D point clouds: comparison of 4 different methods considering fruit occlusions. Comput. Electron. Agric. 188, 106343. DOI: 10.1016/j.compag.2021.106343

Gené-Mola, J., Sanz-Cortiella, R., Rosell-Polo, J.R., Morros, J.-R.R., Ruiz-Hidalgo, J., Vilaplana, V., Gregorio, E., 2020b. Fruit detection and 3D location using instance segmentation neural networks and structure-from-motion photogrammetry. Comput. Electron. Agric. 169. DOI: 10.1016/j.compag.2019.105165

Gené-Mola, J., Vilaplana, V., Rosell-Polo, J.R., Morros, J.R., Ruiz-Hidalgo, J., Gregorio, E., 2019c. Multi-modal deep learning for Fuji apple detection using RGB-D cameras and their radiometric capabilities. Comput. Electron. Agric. 162, 689–698. DOI: 10.1016/j.compag.2019.05.016

Uribeetxebarria, A., Martínez-Casasnovas, J.A., Escolà, A., Rosell-Polo, J.R., Arnó, J., 2019a. Stratified sampling in fruit orchards using cluster-based ancillary information maps: a comparative analysis to improve yield and quality estimates. Precis. Agric. 20, 179–192. DOI: 10.1007/s11119-018-9619-9

Uribeetxebarria, A., Martínez-Casasnovas, J.A., Tisseyre, B., Guillaume, S., Escolà, A., Rosell-Polo, J.R., Arnó, J., 2019b. Assessing ranked set sampling and ancillary data to improve fruit load estimates in peach orchards. Comput. Electron. Agric. 164, 104931. DOI: 10.1016/j.compag.2019.104931

